WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 7 : 

C12N 15/31, C07K 14/315, 16/12, A61K 31/70, 39/09, G01N 33/53, 33/68, C12Q 1/68



A2 



(11) International Publication Number: WO 00/06738 

(43) International Publication Date: 10 February 2000 (10.02.00) 



(21) International Application Number: PCT/GB99/02452 

(22) International Filing Date: 27 July 1999 (27.07.99) 



(30) Priority Data:
9816336.3     27 July 1998 (27.07.98)     GB
60/125,329    19 March 1999 (19.03.99)    US



(71) Applicant (for all designated States except US): MICROBIAL
TECHNICS LIMITED [GB/GB]; 20 Trumpington Street,
Cambridge CB2 1QA (GB).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): LE PAGE, Richard,
William, Falla [GB/GB]; University of Cambridge, Dept.
of Pathology, Tennis Court Road, Cambridge CB2 1QP
(GB). WELLS, Jeremy, Mark [GB/GB]; Actinova Ltd.,
12 Pembroke Avenue, Denny End Industrial Centre,
Waterbeach, Cambridge CB5 9PB (GB). HANNIFFY, Sean,
Bosco [IE/GB]; University of Cambridge, Dept. of Pathology,
Tennis Court Road, Cambridge CB2 1QP (GB). HANSBRO,
Philip, Michael [GB/GB]; University of Cambridge,
Dept. of Pathology, Tennis Court Road, Cambridge CB2
1QP (GB).



(74) Agents: CHAPMAN, Paul, William et al.; Kilburn & Strode, 
20 Red Lion Street, London WC1R 4PJ (GB). 



(81) Designated States: CN, JP, US, European patent (AT, BE, CH,
CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL,
PT, SE).



Published 

Without international search report and to be republished 
upon receipt of that report. 



(54) Title: NUCLEIC ACIDS AND PROTEINS FROM STREPTOCOCCUS PNEUMONIAE 
(57) Abstract 

Novel proteins from Streptococcus pneumoniae are described, together with nucleic acid sequences encoding them. Their use in 
vaccines and in screening methods is also described. 





NUCLEIC ACIDS AND PROTEINS FROM STREPTOCOCCUS PNEUMONIAE 

The present invention relates to proteins derived from Streptococcus pneumoniae, 
nucleic acid molecules encoding such proteins, the use of the nucleic acid and/or 
proteins as antigens/immunogens and in detection/diagnosis, as well as methods for 
screening the proteins/nucleic acid sequences as potential anti-microbial targets.

Streptococcus pneumoniae, commonly referred to as the pneumococcus, is an
important pathogenic organism. The continuing significance of Streptococcus
pneumoniae infections in relation to human disease in developing and developed
countries has been authoritatively reviewed (Siber, G.R., Science, 265: 1385-1387
(1994)). That review indicates that on a global scale this organism is believed to be the
most common bacterial cause of acute respiratory infections, and is estimated to
result in 1 million childhood deaths each year, mostly in developing countries
(Stansfield, S.K., Pediatr. Infect. Dis., 6: 622 (1987)). In the USA it has been
suggested (Breiman et al., Arch. Intern. Med., 150: 1401 (1990)) that the
pneumococcus is still the most common cause of bacterial pneumonia, and that
disease rates are particularly high in young children, in the elderly, and in patients
with predisposing conditions such as asplenia, heart, lung and kidney disease,
diabetes, alcoholism, or with immunosuppressive disorders, especially AIDS. These
groups are at higher risk of pneumococcal septicaemia and hence meningitis and
therefore have a greater risk of dying from pneumococcal infection. The
pneumococcus is also the leading cause of otitis media and sinusitis, which remain
prevalent infections in children in developed countries, and which incur substantial
costs.

The need for effective preventative strategies against pneumococcal infection is
highlighted by the recent emergence of penicillin-resistant pneumococci. It has been
reported that 6.6% of pneumococcal isolates in 13 US hospitals in 12 states were found
to be resistant to penicillin and some isolates were also resistant to other antibiotics
including third generation cephalosporins (Schappert, S.M., Vital and Health Statistics
of the Centres for Disease Control/National Centre for Health Statistics, 214:1
(1992)). The rates of penicillin resistance can be higher (up to 20%) in some
hospitals (Breiman et al., J. Am. Med. Assoc., 271: 1831 (1994)). Since the
development of penicillin resistance among pneumococci is both recent and sudden,
coming after decades during which penicillin remained an effective treatment, these
findings are regarded as alarming.

For the reasons given above, there are therefore compelling grounds for considering 
improvements in the means of preventing, controlling, diagnosing or treating 
pneumococcal diseases. 

Various approaches have been taken in order to provide vaccines for the prevention 
of pneumococcal infections. Difficulties arise for instance in view of the variety of 
serotypes (at least 90) based on the structure of the polysaccharide capsule 
surrounding the organism. Vaccines against individual serotypes are not effective 
against other serotypes and this means that vaccines must include polysaccharide 
antigens from a whole range of serotypes in order to be effective in a majority of 
cases. An additional problem arises because it has been found that the capsular 
polysaccharides (each of which determines the serotype and is the major protective 
antigen) when purified and used as a vaccine do not reliably induce protective 
antibody responses in children under two years of age, the age group which suffers 
the highest incidence of invasive pneumococcal infection and meningitis. 

A modification of the approach using capsule antigens relies on conjugating the 
polysaccharide to a protein in order to derive an enhanced immune response, 
particularly by giving the response T-cell dependent character. This approach has 
been used in the development of a vaccine against Haemophilus influenzae, for
instance. There are, however, issues of cost concerning both the
multi-polysaccharide vaccines and those based on conjugates.

A third approach is to look for other antigenic components which offer the potential
to be vaccine candidates. This is the basis of the present invention. Using a specially
developed bacterial expression system, we have been able to identify a group of
protein antigens from pneumococcus which are associated with the bacterial
envelope or which are secreted.

Thus, in a first aspect the present invention provides a Streptococcus pneumoniae
protein or polypeptide having a sequence selected from those shown in Table 1.

In a second aspect, the present invention provides a Streptococcus pneumoniae
protein or polypeptide having a sequence selected from those shown in Table 2.

A protein or polypeptide of the present invention may be provided in substantially pure
form. For example, it may be provided in a form which is substantially free of other
proteins.

As discussed herein, the proteins and polypeptides of the invention are useful as
antigenic material. Such material can be "antigenic" and/or "immunogenic".
Generally, "antigenic" is taken to mean that the protein or polypeptide is capable of
being used to raise antibodies or indeed is capable of inducing an antibody response in
a subject. "Immunogenic" is taken to mean that the protein or polypeptide is capable of
eliciting a protective immune response in a subject. Thus, in the latter case, the protein
or polypeptide may be capable of not only generating an antibody response but, in
addition, a non-antibody based immune response.

The skilled person will appreciate that homologues or derivatives of the proteins or
polypeptides of the invention will also find use in the context of the present invention,
ie as antigenic/immunogenic material. Thus, for instance, proteins or polypeptides
which include one or more additions, deletions, substitutions or the like are
encompassed by the present invention. In addition, it may be possible to replace one
amino acid with another of similar "type", for instance replacing one hydrophobic
amino acid with another.

One can use a program such as the CLUSTAL program to compare amino acid
sequences. This program compares amino acid sequences and finds the optimal
alignment by inserting spaces in either sequence as appropriate. It is possible to
calculate amino acid identity or similarity (identity plus conservation of amino acid
type) for an optimal alignment. A program like BLASTx will align the longest stretch
of similar sequences and assign a value to the fit. It is thus possible to obtain a
comparison where several regions of similarity are found, each having a different
score. Both types of identity analysis are contemplated in the present invention.
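
The distinction between identity and similarity can be made concrete with a short sketch. The Python below is illustrative only. It assumes the two sequences have already been aligned (for example by CLUSTAL), that inserted spaces are written as "-", and that residues fall into the four "types" shown, which is a simplified grouping chosen here and not one defined in this specification. Percentages are taken over the columns in which neither sequence has a space, which is one of several conventions in use.

```python
# Illustrative residue "types"; this grouping is an assumption made for the sketch.
RESIDUE_TYPES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "acidic": "DE",
    "basic": "KRH",
}
TYPE_OF = {aa: name for name, members in RESIDUE_TYPES.items() for aa in members}


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Similarity counts identical residues plus substitutions within one type.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # column where a space was inserted; not scored
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif a in TYPE_OF and TYPE_OF.get(a) == TYPE_OF.get(b):
            similar += 1  # different residue, same type
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


identity, similarity = identity_and_similarity("MKV-LA", "MRVQIA")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

In this toy pair, three of the five scored columns are identical and the other two are same-type substitutions, so similarity is higher than identity.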

In the case of homologues and derivatives, the degree of identity with a protein or
polypeptide as described herein is less important than that the homologue or derivative
should retain the antigenicity or immunogenicity of the original protein or polypeptide.
However, suitably, homologues or derivatives having at least 60% similarity (as
discussed above) with the proteins or polypeptides described herein are provided.
Preferably, homologues or derivatives having at least 70% similarity, more preferably
at least 80% similarity are provided. Most preferably, homologues or derivatives
having at least 90% or even 95% similarity are provided.

In an alternative approach, the homologues or derivatives could be fusion proteins, 
incorporating moieties which render purification easier, for example by effectively 
tagging the desired protein or polypeptide. It may be necessary to remove the "tag" or
it may be the case that the fusion protein itself retains sufficient antigenicity to be 
useful. 

In an additional aspect of the invention there are provided antigenic/immunogenic 
fragments of the proteins or polypeptides of the invention, or of homologues or 
derivatives thereof. 

For fragments of the proteins or polypeptides described herein, or of homologues or
derivatives thereof, the situation is slightly different. It is well known that it is possible to
screen an antigenic protein or polypeptide to identify epitopic regions, ie those regions
which are responsible for the protein or polypeptide's antigenicity or immunogenicity.
Methods for carrying out such screening are well known in the art. Thus, the fragments
of the present invention should include one or more such epitopic regions or be
sufficiently similar to such regions to retain their antigenic/immunogenic properties.
Thus, for fragments according to the present invention the degree of identity is perhaps
irrelevant, since they may be 100% identical to a particular part of a protein or
polypeptide, homologue or derivative as described herein. The key issue, once again, is
that the fragment retains the antigenic/immunogenic properties.

Thus, what is important for homologues, derivatives and fragments is that they possess 
at least a degree of the antigenicity/immunogenicity of the protein or polypeptide from 
which they are derived. 

Gene cloning techniques may be used to provide a protein of the invention in 
substantially pure form. These techniques are disclosed, for example, in J. Sambrook 
et al Molecular Cloning 2nd Edition, Cold Spring Harbor Laboratory Press (1989). 
Thus, in a third aspect, the present invention provides a nucleic acid molecule 
comprising or consisting of a sequence which is:

(i) any of the DNA sequences set out in Table 1 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which has substantial identity with any of those of (i), (ii)
and (iii); or

(v) a sequence which codes for a homologue, derivative or fragment of a
protein as defined in Table 1.

In a fourth aspect the present invention provides a nucleic acid molecule comprising or 
consisting of a sequence which is: 

(i) any of the DNA sequences set out in Table 2 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which has substantial identity with any of those of (i), (ii)
and (iii); or

(v) a sequence which codes for a homologue, derivative or fragment of a
protein as defined in Table 2.

The nucleic acid molecules of the invention may include a plurality of such sequences, 
and/or fragments. The skilled person will appreciate that the present invention can 
include novel variants of those particular novel nucleic acid molecules which are 
exemplified herein. Such variants are encompassed by the present invention. These 
may occur in nature, for example because of strain variation. For example, additions,
substitutions and/or deletions are included. In addition, and particularly when utilising
microbial expression systems, one may wish to engineer the nucleic acid sequence by 
making use of known preferred codon usage in the particular organism being used for 
expression. Thus, synthetic or non-naturally occurring variants are also included within 
the scope of the invention. 
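
As a rough illustration of engineering a sequence around preferred codon usage, the Python sketch below swaps codons for synonymous ones and checks that the encoded polypeptide is unchanged. The codon table is the standard genetic code. The PREFERRED_CODONS mapping is hypothetical and does not describe the real codon usage of any organism; in practice it would be replaced by figures for the chosen expression host.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preferences, for illustration only.
PREFERRED_CODONS = {"L": "CTG", "K": "AAA", "A": "GCG"}


def split_codons(dna):
    """Split a coding sequence into codons, reading from the first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(dna))


def recode(dna, preferred=PREFERRED_CODONS):
    """Replace codons with preferred synonymous codons where one is listed."""
    recoded = []
    for codon in split_codons(dna):
        amino_acid = CODON_TABLE[codon]
        choice = preferred.get(amino_acid, codon)
        if CODON_TABLE[choice] != amino_acid:
            raise ValueError(f"{choice} does not encode {amino_acid}")
        recoded.append(choice)
    return "".join(recoded)


original = "ATGTTAAAGGCTTAA"
engineered = recode(original)
print(original, "->", engineered)
# The nucleic acid sequence differs but the encoded polypeptide does not.
assert translate(engineered) == translate(original)
```

The same check also illustrates why several different nucleic acid sequences can code for the same protein or polypeptide.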

The term "RNA equivalent" when used above indicates that a given RNA molecule has
a sequence which is complementary to that of a given DNA molecule (allowing for the
fact that in RNA "U" replaces "T" in the genetic code).
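
A minimal Python sketch of this definition follows. It assumes the DNA strand is given 5' to 3' using only the letters A, C, G and T, and it returns the complementary RNA strand, also written 5' to 3'. The function name is hypothetical and chosen for this example.

```python
# DNA base -> pairing RNA base ("U" takes the place of "T" in RNA).
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complementary_rna(dna):
    """Return the RNA strand complementary to a DNA strand, written 5' to 3'."""
    # Reverse the input so that the antiparallel strand reads 5' to 3'.
    return "".join(DNA_TO_RNA_PAIR[base] for base in reversed(dna.upper()))


print(complementary_rna("ATGC"))
```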

When comparing nucleic acid sequences for the purposes of determining the degree of
homology or identity one can use programs such as BESTFIT and GAP (both from the
Wisconsin Genetics Computer Group (GCG) software package). BESTFIT, for
example, compares two sequences and produces an optimal alignment of the most
similar segments. GAP enables sequences to be aligned along their whole length and
finds the optimal alignment by inserting spaces in either sequence as appropriate.
Suitably, in the context of the present invention when discussing identity of nucleic acid
sequences, the comparison is made by alignment of the sequences along their whole
length.
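
Whole-length comparison of the kind described can be sketched with the Needleman-Wunsch global alignment algorithm. The Python below is a minimal illustration and is not the GCG implementation: the scoring values (match +1, mismatch -1, space -2) are assumptions made for this example, and real programs use more elaborate scoring, including separate penalties for opening and extending a space.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (aligned_a, aligned_b, score); inserted spaces are written as '-'.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # space in b
                score[i][j - 1] + gap,             # space in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of the whole alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a) if aligned_a else 0.0


top, bottom, total = global_align("ACGT", "AGT")
print(top)
print(bottom)
print(total, percent_identity(top, bottom))
```

Here identity is expressed over the full alignment length, spaces included; other denominators are also in common use, so a reported percentage should state which was applied.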

Preferably, sequences which have substantial identity have at least 50% sequence
identity, desirably at least 75% sequence identity and more desirably at least 90% or at
least 95% sequence identity with said sequences. In some cases the sequence identity
may be 99% or above.
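
The graded thresholds above can be expressed as a simple lookup. The Python sketch below is illustrative only; the function name and the returned labels are hypothetical, and the percentage passed in is assumed to come from a whole-length alignment.

```python
# Thresholds taken from the paragraph above, highest first.
IDENTITY_TIERS = [
    (99.0, "99% or above"),
    (95.0, "at least 95%"),
    (90.0, "at least 90%"),
    (75.0, "at least 75%"),
    (50.0, "at least 50%"),
]


def identity_tier(percent):
    """Return the highest threshold met, or None if below 50%."""
    for threshold, label in IDENTITY_TIERS:
        if percent >= threshold:
            return label
    return None


for value in (42.0, 80.0, 96.5):
    print(value, identity_tier(value))
```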

Desirably, the term "substantial identity" indicates that said sequence has a greater 
degree of identity with any of the sequences described herein than with prior art nucleic 
acid sequences.

It should however be noted that where a nucleic acid sequence of the present invention 
codes for at least part of a novel gene product the present invention includes within its 
scope all possible sequences coding for the gene product or for a novel part thereof.

The nucleic acid molecule may be in isolated or recombinant form. It may be 
incorporated into a vector and the vector may be incorporated into a host. Such vectors 
and suitable hosts form yet further aspects of the present invention. 

Therefore, for example, by using probes based upon the nucleic acid sequences 
provided herein, genes in Streptococcus pneumoniae can be identified. They can then 
be excised using restriction enzymes and cloned into a vector. The vector can be 
introduced into a suitable host for expression. 

Nucleic acid molecules of the present invention may be obtained from S. pneumoniae by
the use of appropriate probes complementary to part of the sequences of the nucleic
acid molecules. Restriction enzymes or sonication techniques can be used to obtain
appropriately sized fragments for probing.

Alternatively, PCR techniques may be used to amplify a desired nucleic acid sequence.
Thus the sequence data provided herein can be used to design two primers for use in
PCR so that a desired sequence, including whole genes or fragments thereof, can be
targeted and then amplified to a high degree.

Typically primers will be at least 15-25 nucleotides long. 
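
As a sketch of how sequence data can be turned into a primer pair, the Python below takes the forward primer from the 5' end of the target and the reverse primer as the reverse complement of its 3' end. It is a simplification under stated assumptions: the target is given 5' to 3' using only A, C, G and T, the default length of 20 is an arbitrary choice within the range mentioned above, and practical considerations such as melting temperature, GC content and primer-dimer formation are ignored. The target sequence shown is made up for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(target, length=20):
    """Return (forward, reverse) primers flanking the whole target sequence."""
    target = target.upper()
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]                        # anneals to the bottom strand
    reverse = reverse_complement(target[-length:])   # anneals to the top strand
    return forward, reverse


# Made-up target sequence, for illustration only.
target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
forward, reverse = design_primers(target, length=15)
print("forward:", forward)
print("reverse:", reverse)
```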

As a further alternative chemical synthesis may be used. This may be automated. 
Relatively short sequences may be chemically synthesised and ligated together to 
provide a longer sequence. 

There is another group of proteins from S. pneumoniae which have been identified 
using the bacterial expression system described herein. These are known proteins 
from S. pneumoniae, which have not previously been identified as antigenic proteins. 
The amino acid sequences of this group of proteins, together with DNA sequences 
coding for them are shown in Table 3. These proteins, or homologues, derivatives 
and/or fragments thereof also find use as antigens/immunogens. Thus, in another 
aspect the present invention provides the use of a protein or polypeptide having a 
sequence selected from those shown in Tables 1-3, or homologues, derivatives 
and/or fragments thereof, as an immunogen/antigen. 

In yet a further aspect the present invention provides an immunogenic/antigenic 
composition comprising one or more proteins or polypeptides selected from those 
whose sequences are shown in Tables 1-3, or homologues or derivatives thereof, 
and/or fragments of any of these. In preferred embodiments, the 
immunogenic/antigenic composition is a vaccine or is for use in a diagnostic assay. 

In the case of vaccines suitable additional excipients, diluents, adjuvants or the like 
may be included. Numerous examples of these are well known in the art.

It is also possible to utilise the nucleic acid sequences shown in Tables 1-3 in the 
preparation of so-called DNA vaccines. Thus, the invention also provides a vaccine 
composition comprising one or more nucleic acid sequences as defined herein. DNA 
vaccines are described in the art (see for instance, Donnelly et al., Ann. Rev.
Immunol., 15:617-648 (1997)) and the skilled person can use such art-described
techniques to produce and use DNA vaccines according to the present invention.

As already discussed herein the proteins or polypeptides described herein, their 
homologues or derivatives, and/or fragments of any of these, can be used in methods 
of detecting/diagnosing S. pneumoniae. Such methods can be based on the detection 
of antibodies against such proteins which may be present in a subject. Therefore the 
present invention provides a method for the detection/diagnosis of S. pneumoniae
which comprises the step of bringing into contact a sample to be tested with at least 
one protein, or homologue, derivative or fragment thereof, as described herein. 
Suitably, the sample is a biological sample, such as a tissue sample or a sample of 
blood or saliva obtained from a subject to be tested. 

In an alternative approach, the proteins described herein, or homologues, derivatives 
and/or fragments thereof, can be used to raise antibodies, which in turn can be used 
to detect the antigens, and hence S. pneumoniae. Such antibodies form another aspect 
of the invention. Antibodies within the scope of the present invention may be 
monoclonal or polyclonal. 

Polyclonal antibodies can be raised by stimulating their production in a suitable animal 
host (e.g. a mouse, rat, guinea pig, rabbit, sheep, goat or monkey) when a protein as 
described herein, or a homologue, derivative or fragment thereof, is injected into the 
animal. If desired, an adjuvant may be administered together with the protein. Well-known
adjuvants include Freund's adjuvant (complete and incomplete) and aluminium
hydroxide. The antibodies can then be purified by virtue of their binding to a protein as
described herein.

Monoclonal antibodies can be produced from hybridomas. These can be formed by 
fusing myeloma cells and spleen cells which produce the desired antibody in order to 
form an immortal cell line. Thus the well-known Kohler & Milstein technique (Nature 
256 (1975)) or subsequent variations upon this technique can be used. 

Techniques for producing monoclonal and polyclonal antibodies that bind to a 
particular polypeptide/protein are now well developed in the art. They are discussed in 
standard immunology textbooks, for example in Roitt et al, Immunology second edition 
(1989), Churchill Livingstone, London. 

In addition to whole antibodies, the present invention includes derivatives thereof which
are capable of binding to proteins etc as described herein. Thus the present invention
includes antibody fragments and synthetic constructs. Examples of antibody fragments
and synthetic constructs are given by Dougall et al in Tibtech 12 372-379 (September
1994).

Antibody fragments include, for example, Fab, F(ab')2 and Fv fragments (these
are discussed in Roitt et al [supra]). Fv fragments can be modified
to produce a synthetic construct known as a single chain Fv (scFv) molecule. This
includes a peptide linker covalently joining VH and VL regions, which contributes to the
stability of the molecule. Other synthetic constructs that can be used include CDR
peptides. These are synthetic peptides comprising antigen-binding determinants.
Peptide mimetics may also be used. These molecules are usually conformationally
restricted organic rings that mimic the structure of a CDR loop and that include
antigen-interactive side chains.

Synthetic constructs include chimaeric molecules. Thus, for example, humanised (or 
primatised) antibodies or derivatives thereof are within the scope of the present 
invention. An example of a humanised antibody is an antibody having human 
framework regions, but rodent hypervariable regions. Ways of producing chimaeric 
antibodies are discussed for example by Morrison et al in PNAS, 81, 6851-6855 (1984) 
and by Takeda et al in Nature, 314, 452-454 (1985).

Synthetic constructs also include molecules comprising an additional moiety that 
provides the molecule with some desirable property in addition to antigen binding. For 
example the moiety may be a label (e.g. a fluorescent or radioactive label). 
Alternatively, it may be a pharmaceutically active agent.

Antibodies, or derivatives thereof, find use in detection/diagnosis of S. pneumoniae. 
Thus, in another aspect the present invention provides a method for the 
detection/diagnosis of S. pneumoniae which comprises the step of bringing into contact
a sample to be tested and antibodies capable of binding to one or more proteins 
described herein, or to homologues, derivatives and/or fragments thereof. 

In addition, so-called "Affibodies" may be utilised. These are binding proteins 
selected from combinatorial libraries of an alpha-helical bacterial receptor domain 
(Nord et al.). Thus, small protein domains capable of specific binding to different
target proteins can be selected using combinatorial approaches.

It will also be clear that the nucleic acid sequences described herein may be used to
detect/diagnose S. pneumoniae. Thus, in yet a further aspect, the present invention
provides a method for the detection/diagnosis of S. pneumoniae which comprises the
step of bringing into contact a sample to be tested with at least one nucleic acid
sequence as described herein. Suitably, the sample is a biological sample, such as a
tissue sample or a sample of blood or saliva obtained from a subject to be tested.
Such samples may be pre-treated before being used in the methods of the invention.
Thus, for example, a sample may be treated to extract DNA. Then, DNA probes
based on the nucleic acid sequences described herein (ie usually fragments of such
sequences) may be used to detect nucleic acid from S. pneumoniae.

In additional aspects, the present invention provides:

(a) a method of vaccinating a subject against S. pneumoniae which comprises the
step of administering to a subject a protein or polypeptide of the invention, or a
derivative, homologue or fragment thereof, or an immunogenic composition of the
invention;

(b) a method of vaccinating a subject against S. pneumoniae which comprises the
step of administering to a subject a nucleic acid molecule as defined herein;

(c) a method for the prophylaxis or treatment of S. pneumoniae infection which
comprises the step of administering to a subject a protein or polypeptide of the
invention, or a derivative, homologue or fragment thereof, or an immunogenic
composition of the invention;

(d) a method for the prophylaxis or treatment of S. pneumoniae infection which
comprises the step of administering to a subject a nucleic acid molecule as defined
herein;

(e) a kit for use in detecting/diagnosing S. pneumoniae infection comprising one
or more proteins or polypeptides of the invention, or homologues, derivatives or
fragments thereof, or an antigenic composition of the invention; and

(f) a kit for use in detecting/diagnosing S. pneumoniae infection comprising one
or more nucleic acid molecules as defined herein.

Given that we have identified a group of important proteins, such proteins are 
potential targets for anti-microbial therapy. It is necessary, however, to determine 
whether each individual protein is essential for the organism's viability. Thus, the 
present invention also provides a method of determining whether a protein or 
polypeptide as described herein represents a potential anti-microbial target which 
comprises antagonising, inhibiting or otherwise interfering with the function or 
expression of said protein and determining whether S. pneumoniae is still viable.

A suitable method for inactivating the protein is to effect selected gene knockouts, ie 
prevent expression of the protein and determine whether this results in a lethal 
change. Suitable methods for carrying out such gene knockouts are described in Li
et al., P.N.A.S., 94:13251-13256 (1997) and Kolkman et al., 178:3736-3741 (1996).

In a final aspect the present invention provides the use of an agent capable of 
antagonising, inhibiting or otherwise interfering with the function or expression of a 
protein or polypeptide of the invention in the manufacture of a medicament for use in 
the treatment or prophylaxis of S. pneumoniae infection. 

As mentioned above, we have used a bacterial expression system as a means of 
identifying those proteins which are surface associated, secreted or exported and 
thus would find use as antigens.

The information necessary for the secretion/export of proteins has been extensively
studied in bacteria. In the majority of cases, protein export requires a signal peptide
to be present at the N-terminus of the precursor protein so that it becomes directed to
the translocation machinery on the cytoplasmic membrane. During or after
translocation, the signal peptide is removed by a membrane associated signal
peptidase. Ultimately the localization of the protein (i.e. whether it be secreted, an
integral membrane protein or attached to the cell wall) is determined by sequences
other than the leader peptide itself.

We are specifically interested in surface located or exported proteins as these are
likely to be antigens for use in vaccines, as diagnostic reagents or as targets for
therapy with novel chemical entities. We have therefore developed a screening
vector-system in Lactococcus lactis that permits genes encoding exported proteins to
be identified and isolated. We provide below a representative example showing how
given novel surface associated proteins from Streptococcus pneumoniae have been
identified and characterized. The screening vector incorporates the staphylococcal
nuclease gene nuc lacking its own export signal as a secretion reporter.

Staphylococcal nuclease is a naturally secreted, heat-stable, monomeric enzyme
which has been efficiently expressed and secreted in a range of Gram positive
bacteria (Shortle, Gene, 22:181-189 (1983); Kovacevic et al., J. Bacteriol.,
162:521-528 (1985); Miller et al., J. Bacteriol., 169:3508-3514 (1987); Liebl et al.,
J. Bacteriol., 174:1854-1861 (1992); Le Loir et al., J. Bacteriol., 176:5135-5139
(1994); Poquet et al., J. Bacteriol., 180:1904-1912 (1998)).

Recently, Poquet et al. ((1998), supra) have described a screening vector
incorporating the nuc gene lacking its own signal leader as a reporter to identify
exported proteins in Gram positive bacteria, and have applied it to L. lactis. This
vector (pFUN) contains the pAMβ1 replicon, which functions in a broad host range
of Gram-positive bacteria, in addition to the ColE1 replicon that promotes replication
in Escherichia coli and certain other Gram negative bacteria. Unique cloning sites
present in the vector can be used to generate transcriptional and translational fusions
between cloned genomic DNA fragments and the open reading frame of the
truncated nuc gene devoid of its own signal secretion leader. The nuc gene makes an
ideal reporter gene because the secretion of nuclease can readily be detected using a
simple and sensitive plate test: recombinant colonies secreting the nuclease develop
a pink halo whereas control colonies remain white (Shortle, (1983), supra; Le Loir
et al., (1994), supra).

Thus, the invention will now be described with reference to the following
representative example, which provides details of how the proteins, polypeptides and
nucleic acid sequences described herein were identified as antigenic targets.

We describe herein the construction of three reporter vectors and their use in L. 
lactis to identify and isolate genomic DNA fragments from Streptococcus 
pneumoniae encoding secreted or surface associated proteins. 

The invention will now be described with reference to the examples, which should
not be construed as in any way limiting the invention. The examples refer to the
figures in which:

Figure 1: shows the results of a number of DNA vaccine trials; and
Figure 2: shows the results of further DNA vaccine trials.

EXAMPLE 1 

(i) Construction of the pTREP1-nuc series of reporter vectors

(a) Construction of expression plasmid pTREP1

The pTREP1 plasmid is a high-copy number (40-80 per cell) theta-replicating
Gram-positive plasmid, which is a derivative of the pTREX plasmid, which is itself a
derivative of the previously published pIL253 plasmid. pIL253 incorporates the
broad Gram-positive host range replicon of pAMβ1 (Simon and Chopin, Biochimie,
70:559-567 (1988)) and is non-mobilisable by the L. lactis sex-factor. pIL253 also
lacks the tra function which is necessary for transfer or efficient mobilisation by
conjugative parent plasmids exemplified by pIL501. The Enterococcal pAMβ1
replicon has previously been transferred to various species including Streptococcus,
Lactobacillus and Bacillus species as well as Clostridium acetobutylicum (Oultram
and Klaenhammer, FEMS Microbiology Letters, 27:129-134 (1985); Gibson et
al., (1979); LeBlanc et al., Proceedings of the National Academy of Sciences USA,
75:3484-3487 (1978)), indicating the potential broad host range utility. The pTREP1
plasmid represents a constitutive transcription vector.

The pTREX vector was constructed as follows. An artificial DNA fragment
containing a putative RNA stabilising sequence, a translation initiation region (TIR),
a multiple cloning site for insertion of the target genes and a transcription terminator
was created by annealing 2 complementary oligonucleotides and extending with Tfl
DNA polymerase. The sense and anti-sense oligonucleotides contained the
recognition sites for NheI and BamHI at their 5' ends respectively to facilitate
cloning. This fragment was cloned between the XbaI and BamHI sites in
pUC19NT7, a derivative of pUC19 which contains the T7 expression cassette from
pLET1 (Wells et al., J. Appl. Bacteriol., 74:629-636 (1993)) cloned between the
EcoRI and HindIII sites. The resulting construct was designated pUCLEX. The
complete expression cassette of pUCLEX was then removed by cutting with HindIII
and blunting, followed by cutting with EcoRI, before cloning into the EcoRI and SacI
(blunted) sites of pIL253 to generate the vector pTREX (Wells and Schofield, In
Current advances in metabolism, genetics and applications-NATO ASI Series, H
98:37-62 (1996)). The putative RNA stabilising sequence and TIR are derived from
the Escherichia coli T7 bacteriophage sequence and modified at one nucleotide
position to enhance the complementarity of the Shine-Dalgarno (SD) motif to the
16S ribosomal RNA of Lactococcus lactis (Schofield et al., pers. comm., University of
Cambridge, Dept. of Pathology).

A Lactococcus lactis MG1363 chromosomal DNA fragment exhibiting promoter
activity, which was subsequently designated P7, was cloned between the EcoRI and
BglII sites present in the expression cassette, creating pTREX7. This active
promoter region had been previously isolated using the promoter probe vector
pSB292 (Waterfield et al., Gene, 165:9-15 (1995)). The promoter fragment was
amplified by PCR using Vent DNA polymerase according to the manufacturer's
instructions.

The pTREP1 vector was then constructed as follows. An artificial DNA fragment
which included a transcription terminator, the forward pUC sequencing primer, a
promoter multiple-cloning-site region and a universal translation stop sequence was
created by annealing two overlapping, partially complementary synthetic
oligonucleotides together and extending with Sequenase according to the manufacturer's
instructions. The sense and anti-sense oligonucleotides (pTREPF and pTREPR)
contained the recognition sites for EcoRV and BamHI at their 5' ends respectively to
facilitate cloning into pTREX7. The transcription terminator was that of the Bacillus
penicillinase gene, which has been shown to be effective in Lactococcus (Jos et al.,
Applied and Environmental Microbiology, 50:540-542 (1985)). This was considered
necessary as expression of target genes in the pTREX vectors was observed to be
leaky, which is thought to be the result of cryptic promoter activity in the origin region
(Schofield et al., pers. comm., University of Cambridge, Dept. of Pathology). The
forward pUC sequencing primer was included to enable direct sequencing of cloned
DNA fragments. The translation stop sequence, which encodes a stop codon in 3
different frames, was included to prevent translational fusions between vector genes
and cloned DNA fragments. The pTREX7 vector was first digested with EcoRI and
blunted using the 5'-3' polymerase activity of T4 DNA polymerase (NEB)
according to the manufacturer's instructions. The EcoRI-digested and blunt-ended
pTREX7 vector was then digested with BglII, thus removing the P7 promoter. The
artificial DNA fragment derived from the annealed synthetic oligonucleotides was
then digested with EcoRV and BamHI and cloned into the EcoRI(blunted)-BglII
digested pTREX7 vector to generate pTREP. A Lactococcus lactis MG1363
chromosomal promoter designated P1 was then cloned between the EcoRI and BglII
sites present in the pTREP expression cassette, forming pTREP1. This promoter was
also isolated using the promoter probe vector pSB292 and characterised by
Waterfield et al., (1995), supra. The P1 promoter fragment was originally
amplified by PCR using Vent DNA polymerase according to the manufacturer's
instructions and cloned into pTREX as an EcoRI-BglII DNA fragment. The
EcoRI-BglII P1 promoter-containing fragment was removed from pTREX1 by
restriction enzyme digestion and used for cloning into pTREP (Schofield et al., pers.
comm., University of Cambridge, Dept. of Pathology).

(b) PCR amplification of the S. aureus nuc gene

The nucleotide sequence of the S. aureus nuc gene (EMBL database accession
number V01281) was used to design synthetic oligonucleotide primers for PCR
amplification. The primers were designed to amplify the mature form of the nuc
gene, designated nucA, which is generated by proteolytic cleavage of the N-terminal
19 to 21 amino acids of the secreted propeptide designated SNase B (Shortle, (1983),
supra). Three sense primers (nucS1, nucS2 and nucS3, Appendix 1) were designed,
each one having a blunt-ended restriction endonuclease cleavage site for EcoRV or
SmaI in a different reading frame with respect to the nuc gene. Additionally, BglII
and BamHI sites were incorporated at the 5' ends of the sense and anti-sense primers
respectively to facilitate cloning into BamHI and BglII cut pTREP1. The sequences
of all the primers are given in Appendix 1. Three nuc gene DNA fragments
encoding the mature form of the nuclease gene (NucA) were amplified by PCR using
each of the sense primers combined with the anti-sense primer described above. The
nuc gene fragments were amplified by PCR using S. aureus genomic DNA template,
Vent DNA Polymerase (NEB) and the conditions recommended by the
manufacturer. An initial denaturation step at 93 °C for 2 min was followed by 30
cycles of denaturation at 93 °C for 45 sec, annealing at 50 °C for 45 sec, and
extension at 73 °C for 1 min, and then a final 5 min extension step at 73 °C. The
PCR amplified products were purified using a Wizard clean up column (Promega) to
remove unincorporated nucleotides and primers.
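
The statement that the three sense primers place the blunt cloning site in three different reading frames can be checked directly from the primer sequences. The following Python sketch is illustrative only: it uses the nucS1, nucS2 and nucS3 sequences as transcribed in Appendix 1, and it assumes (this is not stated in the text) that the 3' portion shared by all three primers, here called CORE, marks the start of the nuc-derived sequence. It reports only the relative offset between the blunt cut and that shared portion, not the absolute codon phase of nuc.

```python
# Sense primers as transcribed in Appendix 1, with the blunt-cutting site each carries.
PRIMERS = {
    "nucS1": ("cgagatctgatatctcacaaacagataacggcgtaaatag", "gatatc"),     # EcoRV
    "nucS2": ("gaagatcttccccgggatcacaaacagataacggcgtaaatag", "cccggg"),  # SmaI
    "nucS3": ("cgagatctgatatccatcacaaacagataacggcgtaaatag", "gatatc"),   # EcoRV
}

# Assumption: the 3' sequence common to all three primers (found by inspection).
CORE = "tcacaaacagataacggcgtaaatag"


def frame_offset(primer: str, site: str, core: str = CORE) -> int:
    """Bases between the blunt cut and the shared core, modulo 3.

    EcoRV (GAT^ATC) and SmaI (CCC^GGG) both cut in the middle of their
    six-base site, so the cut lies 3 bases after the start of the site.
    """
    cut = primer.index(site) + 3
    return (primer.index(core) - cut) % 3


offsets = {name: frame_offset(seq, site) for name, (seq, site) in PRIMERS.items()}
for name, off in offsets.items():
    print(name, "offset mod 3 =", off)
```

If the three offsets are all different, a blunt-ended insert ligated into the three vectors is read in all three phases relative to nuc, which is what allows any given genomic fragment to form an in-frame fusion in one of them.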

(c) Construction of the pTREP1-nuc vectors

The purified nuc gene fragments described in section (b) were digested with BglII and
BamHI using standard conditions and ligated to BamHI and BglII cut and
dephosphorylated pTREP1 to generate the pTREP1-nuc1, pTREP1-nuc2 and
pTREP1-nuc3 series of reporter vectors. General molecular biology techniques were
carried out using the reagents and buffers supplied by the manufacturer or using
standard conditions (Sambrook and Maniatis, (1989), supra). In each of the pTREP1-nuc
vectors the expression cassette comprises a transcription terminator, the lactococcal
promoter P1, unique cloning sites (BglII, EcoRV or SmaI) followed by the mature
form of the nuc gene, and a second transcription terminator. Note that the sequences
required for translation and secretion of the nuc gene were deliberately excluded in
this construction. Such elements can only be provided by appropriately digested
foreign DNA fragments (representing the target bacterium) which can be cloned into
the unique restriction sites present immediately upstream of the nuc gene.

In possessing a promoter, the pTREP1-nuc vectors differ from the pFUN vector
described by Poquet et al. (1998), supra, which was used to identify L. lactis
exported proteins by screening for Nuc activity directly in L. lactis. As the
pFUN vector does not contain a promoter upstream of the nuc open reading frame,
the cloned genomic DNA fragment must also provide the signals for transcription in
addition to those elements required for translation initiation and secretion of Nuc.
This limitation may prevent the isolation of genes that are distant from a promoter,
for example genes which are within polycistronic operons. Additionally, there can be
no guarantee that promoters derived from other species of bacteria will be
recognised and functional in L. lactis. Certain promoters may be under stringent
regulation in the natural host but not in L. lactis. In contrast, the presence of the P1
promoter in the pTREP1-nuc series of vectors ensures that promoterless DNA
fragments (or DNA fragments containing promoter sequences not active in L. lactis)
will still be transcribed.

(d) Screening for secreted proteins in S. pneumoniae

Genomic DNA isolated from S. pneumoniae was digested with the restriction
enzyme Tru9I. This enzyme, which recognises the sequence 5'- TTAA -3', was used
because it cuts A/T rich genomes efficiently and can generate random genomic
DNA fragments within the preferred size range (usually averaging 0.5-1.0 kb).
This size range was preferred because there is an increased probability that the P1
promoter can be utilised to transcribe a novel gene sequence. However, the P1
promoter may not be necessary in all cases as it is possible that many Streptococcal
promoters are recognised in L. lactis. DNA fragments of different size ranges were
purified from partial Tru9I digests of S. pneumoniae genomic DNA. As the Tru9I
restriction enzyme generates staggered ends, the DNA fragments had to be made
blunt ended before ligation to the EcoRV or SmaI cut pTREP1-nuc vectors. This
was achieved by a partial fill-in enzyme reaction using the 5'-3' polymerase
activity of Klenow enzyme. Briefly, Tru9I digested DNA was dissolved in a solution
(usually between 10-20 µl in total) supplemented with T4 DNA ligase buffer (New
England Biolabs; NEB) (1X) and 33 µM of each of the required dNTPs, in this case
dATP and dTTP. Klenow enzyme was added (1 unit Klenow enzyme (NEB) per µg
of DNA) and the reaction incubated at 25°C for 15 minutes. The reaction was
stopped by incubating the mix at 75°C for 20 minutes. EcoRV or SmaI digested
pTREP1-nuc plasmid DNA was then added (usually between 200-400 ng). The mix
was then supplemented with 400 units of T4 DNA ligase (NEB) and T4 DNA ligase
buffer (1X) and incubated overnight at 16°C. The ligation mix was precipitated
directly in 100% ethanol and 1/10 volume of 3M sodium acetate (pH 5.2) and used
to transform L. lactis MG1363 (Gasson, 1983). Alternatively, the gene cloning site
of the pTREP1-nuc vectors also contains a BglII site which can be used to clone, for
example, Sau3AI digested genomic DNA fragments.
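
As an illustration of how TTAA sites partition a sequence, the following Python sketch performs a complete in-silico digest of a linear sequence. It is a simplification made under stated assumptions: the text describes partial digests followed by size selection, whereas this sketch cuts at every site, and the cut position (after the first T, i.e. T^TAA) is an assumption not given in the text. The function names and the toy sequence are hypothetical.

```python
def tru9i_fragments(seq: str, site: str = "TTAA", cut_offset: int = 1) -> list[str]:
    """Cut a linear sequence at every occurrence of `site` (complete digest).

    cut_offset=1 assumes the enzyme cuts after the first base of the site
    (T^TAA), leaving a two-base 5' overhang on each strand.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def in_size_range(fragments: list[str], lo: int = 500, hi: int = 1000) -> list[str]:
    """Keep fragments whose length falls in the preferred range (0.5-1.0 kb)."""
    return [f for f in fragments if lo <= len(f) <= hi]


toy = "GGTTAACCCTTAAGG"  # hypothetical sequence with two TTAA sites
frags = tru9i_fragments(toy)
print(frags)
```

Because a complete digest at a four-base site yields many short fragments, partial digestion as described above is what makes fragments in the 0.5-1.0 kb range available for cloning.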

L. lactis transformant colonies were grown on brain heart infusion agar and
nuclease-secreting (Nuc+) clones were detected by a toluidine blue-DNA-agar overlay (0.05
M Tris pH 9.0, 10 g of agar per litre, 10 g of NaCl per litre, 0.1 mM CaCl2, 0.03%
wt/vol. salmon sperm DNA and 90 mg of toluidine blue O dye) essentially as
described by Shortle, (1983), supra and Le Loir et al., (1994), supra. The plates were
then incubated at 37°C for up to 2 hours. Nuclease-secreting clones develop an
easily identifiable pink halo. Plasmid DNA was isolated from Nuc+ recombinant L.
lactis clones and DNA inserts were sequenced on one strand using the NucSeq
sequencing primer described in Appendix 1, which sequences directly through the
DNA insert.



Isolation of Genes Encoding Exported Proteins from S. pneumoniae

A large number of gene sequences putatively encoding exported proteins in S.
pneumoniae have been identified using the nuclease screening system. These have
now been further analysed to remove artefacts. The sequences identified using the
screening system have been analysed using a number of parameters.

1. All putative surface proteins were analysed for leader/signal peptide
sequences using the software programs Sequencher (Gene Codes Corporation) and
DNA Strider (Marck, Nucleic Acids Res., 16:1829-1836 (1988)). Bacterial signal
peptide sequences share a common design. They are characterised by a short
positively charged N-terminus (n-region) immediately preceding a stretch of
hydrophobic residues (central portion, h-region) followed by a more polar C-terminal
portion which contains the cleavage site (c-region). Computer software is available
which allows hydropathy profiling of putative proteins and which can readily
identify the very distinctive hydrophobic portion (h-region) typical of leader peptide
sequences. In addition, the sequences were checked for the presence or absence of
a potential ribosomal binding site (Shine-Dalgarno motif) required for translation
initiation of the putative nuc reporter fusion protein.

2. All putative surface protein sequences were also matched against all of the
protein/DNA sequences in the publicly available databases [OWL-proteins inclusive of
SwissProt and GenBank translations]. This allows us to identify sequences similar to
known genes or homologues of genes for which some function has been ascribed.
Hence it has been possible to predict a function for some of the genes identified
using the LEEP system and to unequivocally establish that the system can be used to
identify and isolate gene sequences of surface associated proteins. We should also be
able to confirm that these proteins are indeed surface related and not artifacts. The
LEEP system has been used to identify novel gene targets for vaccine and therapy.

3. Some of the proteins encoded by the identified genes did not possess a typical
leader peptide sequence and did not show homology with any DNA/protein sequences in
the database. Indeed, these proteins may indicate the primary advantage of our
screening method, i.e. the isolation of atypical surface-related proteins, which may
have been missed in all previously described screening protocols or approaches
based on sequence homology searches.
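
The hydropathy profiling mentioned in item 1 can be sketched as a sliding-window average over a hydropathy scale. The Python example below is a minimal illustration and not the method implemented in Sequencher or DNA Strider: it assumes the Kyte-Doolittle scale, and the window size, the length of N-terminal region examined and the threshold are hypothetical choices made for the example.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Mean hydropathy of each window; unknown residues score 0."""
    scores = [KD.get(aa, 0.0) for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_hydrophobic_stretch(protein: str, n_term: int = 40,
                            window: int = 9, threshold: float = 1.8) -> bool:
    """True if any window in the first n_term residues reaches the threshold.

    n_term, window and threshold are illustrative values, not taken from the text.
    """
    return any(s >= threshold for s in hydropathy_profile(protein[:n_term], window))


candidate = "MKK" + "L" * 12 + "ASQDE"   # toy leader-like sequence
charged = "DEDEKRKRDEDEKRKR"             # toy sequence with no h-region
print(has_hydrophobic_stretch(candidate), has_hydrophobic_stretch(charged))
```

A hydrophobic window near the N-terminus is only suggestive of an h-region; the charged n-region and the cleavage site would still need to be inspected separately.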

In all cases, only partial gene sequences were initially obtained. Full length genes
were obtained in all cases by reference to the TIGR S. pneumoniae database
(www@tigr.org). Thus, by matching the originally obtained partial sequences with
the database, we were able to identify the full length gene sequences. In this way, as
described herein, three groups of genes were clearly identified, i.e. a group of genes
encoding previously unidentified S. pneumoniae proteins, a second group exhibiting
some homology with known proteins from a variety of sources and a third group
which encoded known S. pneumoniae proteins which were, however, not known as
antigens.

Example 2: Vaccine trials

pcDNA3.1+ as a DNA vaccine vector

The vector chosen for use as a DNA vaccine vector was pcDNA3.1 (Invitrogen)
(strictly pcDNA3.1+; the forward orientation was used in all cases, but the vector
may be referred to as pcDNA3.1 from here on). This vector has been widely and
successfully employed in the literature as a host vector to test vaccine candidate
genes for protection against pathogens (Zhang et al., Kurar and Splitter, Anderson
et al.). The vector was designed for high-level stable and non-replicative transient
expression in mammalian cells. pcDNA3.1 contains the ColE1 origin of replication,
which allows convenient high-copy number replication and growth in E. coli. This in
turn allows rapid and efficient cloning and testing of many genes. The pcDNA3.1
vector has a large number of cloning sites and also contains the gene encoding
ampicillin resistance to aid in cloning selection and the human cytomegalovirus (CMV)
immediate-early promoter/enhancer, which permits efficient, high-level expression of
the recombinant protein. The CMV promoter is a strong viral promoter in a wide
range of cell types including both muscle and immune (antigen presenting) cells.
This is important for an optimal immune response as it remains unknown which
cell types are most important in generating a protective response in vivo. A T7
promoter upstream of the multiple cloning site affords efficient expression of the
modified insert of interest and allows in vitro transcription of a cloned gene in
the sense orientation.

Zhang, D., Yang, X., Berry, J., Shen, C., McClarty, G. and Brunham, R.C. (1997)
"DNA vaccination with the major outer-membrane protein genes induces acquired
immunity to Chlamydia trachomatis (mouse pneumonitis) infection". Infection and
Immunity, 176, 1035-40.

Kurar, E. and Splitter, G.A. (1997) "Nucleic acid vaccination of Brucella abortus
ribosomal L7/L12 gene elicits immune response". Vaccine, 15, 1851-57.

Anderson, R., Gao, X.-M., Papakonstantinopoulou, A., Roberts, M. and Dougan,
G. (1996) "Immune response in mice following immunisation with DNA encoding
fragment C of tetanus toxin". Infection and Immunity, 64, 3168-3173.

Preparation of DNA vaccines

Oligonucleotide primers were designed for each individual gene of interest derived
using the LEEP system. Each gene was examined thoroughly and, where possible,
primers were designed such that they targeted that portion of the gene thought to
encode only the mature portion of the gene protein. It was hoped that expressing
those sequences that encode only the mature portion of a target gene protein would
facilitate its correct folding when expressed in mammalian cells. For example, in the
majority of cases primers were designed such that putative N-terminal signal peptide
sequences would not be included in the final amplification product to be cloned into
the pcDNA3.1 expression vector. The signal peptide directs the polypeptide
precursor to the cell membrane via the protein export pathway, where it is normally
cleaved off by signal peptidase I (or signal peptidase II if a lipoprotein). Hence the
signal peptide does not make up any part of the mature protein, whether it be
displayed on the surface of the bacteria or secreted. Where an N-terminal
leader peptide sequence was not immediately obvious, primers were designed to
target the whole of the gene sequence for cloning and, ultimately, expression in
pcDNA3.1.

Having said that, however, other additional features of proteins may also affect the
expression and presentation of a soluble protein. DNA sequences encoding such
features in the genes encoding the proteins of interest were excluded during the
design of oligonucleotides. These features included:

1. LPXTG cell wall anchoring motifs.

2. LXXC lipoprotein attachment sites.

3. Hydrophobic C-terminal domain.

4. Where no N-terminal signal peptide or LXXC was present, the start codon was
excluded.

5. Where no hydrophobic C-terminal domain or LPXTG motif was present, the stop
codon was removed.
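
Locating the sequence motifs in items 1 and 2 of the list above can be sketched with regular expressions. The Python example below is an illustration only, assuming X stands for any residue; the function name and the toy protein are hypothetical, and a pattern match on its own does not establish that a motif is functional (for example, LXXC will also match by chance outside a signal peptide).

```python
import re

# X is taken to mean "any residue" (assumption), hence "." in the patterns.
FEATURES = {
    "LPXTG": re.compile(r"LP.TG"),  # cell wall anchoring motif
    "LXXC": re.compile(r"L..C"),    # lipoprotein attachment site
}


def find_features(protein: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each motif (non-overlapping matches)."""
    protein = protein.upper()
    return {
        name: [m.start() for m in pattern.finditer(protein)]
        for name, pattern in FEATURES.items()
    }


toy_protein = "MKKLAACSSSLPETGEE"  # hypothetical sequence carrying both motifs
hits = find_features(toy_protein)
print(hits)
```

Positions found in this way indicate where a primer would need to start or stop so that the amplified region excludes the corresponding feature.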

Appropriate PCR primers were designed for each gene of interest, and any and all of
the regions encoding the above features were removed from the gene when designing
these primers. The primers were designed with the appropriate restriction enzyme
site followed by a conserved Kozak nucleotide sequence (GCCACC was used in most
cases, with occasional exceptions such as ID59; the Kozak sequence facilitates the
recognition of initiator sequences by eukaryotic ribosomes) and an ATG start codon
upstream of the insert of the gene of interest. For example, a forward primer using a
BamHI site would begin GCGGGATCCGCCACCATG, followed by a small section of
the 5' end of the gene of interest. The reverse primer was designed to be compatible
with the forward primer and, in most cases, carried a NotI restriction site at the 5'
end (this site is TTGCGGCCGC), except in occasional instances, for example ID59,
where an XhoI site was used instead of NotI.
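
The primer layout described above (restriction site, Kozak sequence, ATG, then gene-specific sequence) can be sketched as simple string assembly. The Python example below is a hedged illustration: the function names are hypothetical, the leading bases before each site are taken from the example prefixes quoted in the text (GCG before GGATCC, TT before GCGGCCGC), and the gene-specific segments in the usage lines are toy sequences rather than any of the genes described here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def forward_primer(gene_5prime: str, clamp: str = "GCG",
                   site: str = "GGATCC", kozak: str = "GCCACC") -> str:
    """5' clamp + BamHI site + Kozak sequence + ATG + start of the insert."""
    return clamp + site + kozak + "ATG" + gene_5prime.upper()


def reverse_primer(gene_3prime: str, clamp: str = "TT",
                   site: str = "GCGGCCGC") -> str:
    """5' clamp + NotI site + reverse complement of the end of the insert."""
    return clamp + site + revcomp(gene_3prime)


fwd = forward_primer("GGTCTA")  # toy 5' segment of an insert
rev = reverse_primer("AAGG")    # toy 3' segment of an insert
print(fwd)
print(rev)
```

In practice the length of the gene-specific segment would be chosen to give a suitable annealing temperature, which this sketch does not attempt to calculate.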

PCR primers

The following PCR primers were designed and used to amplify the truncated genes
of interest.

ID5
Forward Primer 5' CGGATCCGCCACCATGGGTCTAATTGAAGACTTAAAAAATCAA 3'
Reverse Primer 5' TTGCGGCCGCCAATGCTAGACTAAACACAAGACTCA 3'

ID59
Forward Primer 5' CGCGGATCCATGAAAAAAATCTATTCATTTTTAGCA 3'
Reverse Primer 5' CCCTCGAGGGCTACTTCCGATACATTTTAAACTGTAGG 3'

ID51
Forward Primer 5' CGGATCCGCCACCATGAGTCATGTCGCTGCAAATG 3'
Reverse Primer 5' TTGCGGCCGCATACCAAACGCTGACATCTACG 3'

ID29
Forward Primer 5' CGGATCCGCCACCATGCAAAAAGAGCGGTATGGTTATG 3'
Reverse Primer 5' TTGCGGCCGCACCCCCATTCTTAATCCCTT 3'

ID50
Forward Primer 5' CGGATCCGCCACCATGGAGGTATGTGAAATGTCACGTAAA 3'
Reverse Primer 5' TTGCGGCCGCTTTTACAAAGTCAAGCAAAGCC 3'

Cloning 

The insert, along with the flanking features described above, was amplified using PCR
against a template of genomic DNA isolated from type 4 S. pneumoniae strain 11886
obtained from the National Collection of Type Cultures. The PCR product was cut
with the appropriate restriction enzymes and cloned into the multiple cloning site of
pcDNA3.1 using conventional molecular biological techniques. Suitably mapped
clones of the genes of interest were cultured and the plasmids isolated on a large
scale (> 1.5 mg) using Plasmid Mega Kits (Qiagen). Successful cloning and
maintenance of genes was confirmed by restriction mapping and by sequencing
approximately 700 base pairs through the 5' cloning junction of each large scale
preparation of each construct.

Strain validation

A strain of type 4 was used in cloning and challenge methods; this is the strain
from which the S. pneumoniae genome was sequenced. A freeze dried ampoule of a
homogeneous laboratory strain of type 4 S. pneumoniae, strain NCTC 11886, was
obtained from the National Collection of Type Cultures. The ampoule was opened and
the culture resuspended with 0.5 ml of tryptic soy broth (0.5% glucose, 5%
blood). The suspension was subcultured into 10 ml tryptic soy broth (0.5% glucose,
5% blood) and incubated statically overnight at 37 °C. This culture was streaked on
to 5% blood agar plates to check for contaminants and confirm viability, and on to
blood agar slopes, and the rest of the culture was used to make 20% glycerol stocks.
The slopes were sent to the Public Health Laboratory Service where the type 4
serotype was confirmed.

A glycerol stock of NCTC 11886 was streaked on a 5% blood agar plate and
incubated overnight in a CO2 gas jar at 37°C. Fresh streaks were made and optochin
sensitivity was confirmed.

Pneumococcal challenge

A standard inoculum of type 4 S. pneumoniae was prepared and frozen down by
passaging a culture of pneumococcus 1x through mice, harvesting from the blood of
infected animals, and growing up to a predetermined viable count of around 10^9
cfu/ml in broth before freezing down. The preparation is set out below in the
flow chart.

Streak pneumococcal culture and confirm identity
        |
        V
Grow over-night culture from 4-5 colonies on plate above
        |
        V
Animal passage pneumococcal culture
(i.p. injection; cardiac bleed to harvest)
        |
        V
Grow over-night culture from animal passaged pneumococcus
        |
        V
Grow day culture (to pre-determined optical density) from over-night of animal
passage and freeze down at -70 °C - this is the standard inoculum
        |
        V
Thaw one aliquot of standard inoculum to viable count
        |
        V
Use standard inoculum to determine effective dose (called Virulence Testing)
        |
        V
All subsequent challenges - use standard inoculum to effective dose

An aliquot of standard inoculum was diluted 500x in PBS and used to inoculate the
mice.

Mice were lightly anaesthetised using halothane and then a dose of 1.4 x 10^5 cfu of
pneumococcus was applied to the nose of each mouse. The uptake was facilitated by
the normal breathing of the mouse, which was left to recover on its back.

S. pneumoniae vaccine trials

Vaccine trials in mice were carried out by the administration of DNA to 6 week old
CBA/ca mice (Harlan, UK). Mice to be vaccinated were divided into groups of six
and each group was immunised with recombinant pcDNA3.1+ plasmid DNA
containing a specific target-gene sequence of interest. A total of 100 µg of DNA in
Dulbecco's PBS (Sigma) was injected intramuscularly into the tibialis anterior
muscle of both legs (50 µl in each leg). A boost was carried out using the same
procedure 4 weeks later. For comparison, control groups were included in all
vaccine trials. These control groups were either unvaccinated animals or those
administered with non-recombinant pcDNA3.1+ DNA (sham vaccinated) only,
using the same time course described above. Three weeks after the second
immunisation, all groups of mice were challenged intra-nasally with a lethal dose of
S. pneumoniae serotype 4 (strain NCTC 11886). The number of bacteria administered
was monitored by plating serial dilutions of the inoculum on 5% blood agar plates. A
problem with intranasal inoculation is that in some mice the inoculum bubbles out
of the nostrils; this has been noted in the results table and taken account of in
calculations. A less obvious problem is that a certain amount of the inoculum for
each mouse may be swallowed. It is assumed that this amount will be the same for
each mouse and will average out over the course of inoculations. However, the
sample sizes that have been used are small and this problem may have significant
effects in some experiments. All mice remaining after the challenge were killed 3 or
4 days after infection. During the infection process, challenged mice were monitored
for the development of symptoms associated with the onset of S. pneumoniae
induced disease. Typical symptoms, in approximate order, included piloerection,
an increasingly hunched posture, discharge from eyes, increased lethargy and
reluctance to move. The latter symptoms usually coincided with the development of
a moribund state, at which stage the mice were culled to prevent further suffering.
These mice were deemed to be very close to death, and the time of culling was used
to determine a survival time for statistical analysis. Where mice were found dead,
the survival time was taken as the last time point when the mouse was monitored
alive.

Interpretation of Results

A positive result was taken as any DNA sequence that was cloned and used in
challenge experiments as described above and which gave protection against that
challenge. Protection was attributed to those DNA sequences that gave statistically
significant protection (to a 95% confidence level (p<0.05)) and also to those which
were marginal or close to significant using Mann-Whitney, or which showed some
protective features, for example where there were one or more outlying mice or where
the time to the first death was prolonged. It is acceptable to allow marginal or
non-significant results to be considered as potential positives when it is considered that
the clarity of some of the results may be clouded by the problems associated with the
administration of intranasal infections.

p value 2 refers to significance tests compared to pcDNA3.1+ vaccinated controls

Statistical Analyses.

Trial 1 - None of the other groups had significantly longer survival times than the
controls. The survival times of the unvaccinated and pcDNA3.1 control groups were
not significantly different. One of the mice from ID5 was an outlying result and the
mean survival times for ID5 were extended but not significantly so.
Trial 2 - The group vaccinated with ID59 had significantly longer survival times than
the unvaccinated control group.
Trial 5 - The group vaccinated with ID59 again survived for an average of almost 10
hours longer than the controls but the results were not quite statistically significant.
Trial 6 - The group vaccinated with ID51 did not have survival times significantly
higher than unvaccinated controls (p= <36.0); however, there were 2 outlying mice
in the vaccinated group.

Vaccine trials 7 and 8 (see Figure 2)

Mean survival times (hours)

Mouse     Unvacc        ID29 (7)    Unvacc        ID50 (8)
number    control (7)               control (8)

1         59.6          73.1        45.1          60.6
2         47.2          54.8        50.8          60.6
3         59.6          59.3        60.4          51.1
4         70.9          54.8*       55.2          60.6
5         68.6*         59.3        45.1          60.6
6         76.0          54.8        45.1          60.6
Mean      63.6          59.35       50.2          59.1
sd        10.3          7.1         6.4           3.9
p value 1               <39.0                     0.0048

* - bubbled when dosed so may not have received full inoculum.
T - terminated at end of experiment having no symptoms of infection.
Numbers in brackets - survival times disregarded assuming incomplete dosing
p value 1 refers to significance tests compared to unvaccinated controls

Statistical Analyses. 

Trial 7 - The ID29 vaccinated group showed prolonged times to the first death. T 
Trial 8 - The group vaccinated with ID50 survived significantly longer than 
unvaccinated controls. 
Appendix I - Oligonucleotide primers

nucS1
Bgl II EcoRV
5'- cgagatctgatatctcacaaacagataacggcgtaaatag -3'

nucS2
Bgl II Sma I
5'- gaagatcttccccgggatcacaaacagataacggcgtaaatag -3'

nucS3
Bgl II EcoRV
5'- cgagatctgatatccatcacaaacagataacggcgtaaatag -3'

nucR
Bam HI
5'- cgggatccttatggacctgaatcagcgttgtc -3'

NucSeq
5'- ggatgctttgtttcaggtgtatc -3'

pTREPF
5'- catgatatcggtacctcaagctcatatcattgtccggcaatggtgtgggctttttttgttttagcggataa
caatttcacac -3'

pTREPR
5'- gcggatcccccgggcttaattaatgtttaaacaGtagtcgaagatctcgcgaattctcctgtgtgaaatt
gttatccgcta -3'

pUCF
5'- cgccagggttttcccagtcacgac -3'

VR
5'- tcaggggggcggagcctatg -3'

V1
5'- tcgtatgttgtgtggaattgtg -3'

V2
5'- tccggctcgtatgttgtgtggaattg -3'
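
The restriction sites annotated above the nuc primers can be located programmatically. The sketch below is illustrative only: it assumes the standard recognition sequences for Bgl II (AGATCT), EcoRV (GATATC), Sma I (CCCGGG) and Bam HI (GGATCC), and the function name `find_sites` is hypothetical.

```python
# Standard recognition sequences of the enzymes named in Appendix I.
RECOGNITION_SITES = {
    "BglII": "AGATCT",
    "EcoRV": "GATATC",
    "SmaI": "CCCGGG",
    "BamHI": "GGATCC",
}

# Primer sequences as listed in Appendix I (5' to 3').
PRIMERS = {
    "nucS1": "cgagatctgatatctcacaaacagataacggcgtaaatag",
    "nucS2": "gaagatcttccccgggatcacaaacagataacggcgtaaatag",
    "nucS3": "cgagatctgatatccatcacaaacagataacggcgtaaatag",
    "nucR": "cgggatccttatggacctgaatcagcgttgtc",
}


def find_sites(primer):
    """Return {enzyme: 0-based position} for each recognition site found in the primer."""
    seq = primer.upper()
    return {
        enzyme: seq.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in seq
    }


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

In each nuc primer the sites sit near the 5' end, upstream of the shared priming sequence, which matches the annotations given above each primer.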
TABLE 1

ID4 1200 bp

ATGAGAAATATGTGGGTTGTAATCAAGGAAACCTATCTTCGACATGTCGAGTCATGGAGTTTCI'I Cri^IATGCTGA 
TTTCGCCGTTCCTCTTTTTAGGAATCTCTGTAGGAATTGGGCATCTCCAAGGTTCTTCT 

AGTGGCAGTAGTGACAACAGTGCCATCTGTAGCAGAAGGACTGAAGAATGTAAATGGTGTTAACTTCGACTATAA 
AGACGAAGCAAGTGCCAAAGAAGCAATTAAAGAAGAAAAATTAAAAGGTTATTTGACCATTGATCAAGAAGATA 

GTGTTCTAAAGGCAGTTTATCATGGCGAAACATCGCTTGAAAATGGAATTAAATTTGAGGTTACAGGTACACTCA
ATGAACTGCAAAATCAGCnTAATCGTTCAACTGCTTCCITGTCTCAAGAGCAGGAAAAACGCTTAGCG 
TTCAATTCACAGAAAAGATTGATGAAGCCAAGGAAAATAAAAAGTTTATTCAAACAATTGCAGCAGGTGCCTTAG 
GATTCTTTCTTTATATGATTCTGATTACCTATGCGGGTGTAACAGCTCAGGAAGTTGCCAGTGAAAAAGGCACCAA 
AATTATGGAAGTCGTTTTTTCTAGCATAAGGGCAAGTCACTATTTCTATGCGCGGATGATGGCTCT 

ATTTTAACGCATATTGGGATCTATGTTGTAGGTGGTCTGGCTGCCGT

TCAGTCTGGTATTTTGGATCACTTGGGAGATGCTATCTCACTGAATACCTTGCTCTT^ 
TGTACGTAGTCTTGGCAGCCTTCCTAGGATCTATGGTTTCTCGTCCTGAGGACTCAGGGAAAGCCTT 
GATGATTTTGATTATGGGTGGTTTTTTTGGAGTGACAGCTCTAGGTGCAGCTGGTGACAATCTCCTCTTGAAGATT
GGTTCTTATATTCCCTTTATTTCGACCTTCTTTATGCCGTTTCGAACGATTAATGACTAT 

CATGGATTTCACTTGCTATTACAGTGATTTTTGCGGTGGTAGCAACAGGATTTATCGGACGCATGTATGCTAGTCT
CGTTCTTCAAACGGATGATTTAGGGATTTGGAAAACCTTTAAACGTGCCnTATCTTATAAATAG 

MRNMWVVIKETYLRHVESWSFFFMVISPFLFLGISVGIGHLQGSSMAKNNKVAVVTTVPSVAEGLKNVNGVNFDYKD
EASAKEAIKEEKLKGYLTIDQEDSVLKAVYHGETSLENGIKFEVTGTLNELQNQLNRSTASLSQEQEKRLAQTIQFTEKI
DEAKENKKFIQTIAAGALGFFLYMILITYAGVTAQEVASEKGTKIMEVVFSSIRASHYFYARMMALFLVILTHIGIYVVG
GLAAVLLFKDLPFLAQSGILDHLGDAISLNTLLFILISLFMYVVLAAFLGSMVSRPEDSGKALSPLMILIMGGFFGVTALG
AAGDNLLLKIGSYIPFISTFFMPFRTINDYAGGAEAWISLAITVIFAVVATGFIGRMYASLVLQTDDLGIWKTFKRALSYK 
Z 

ID5 1125 bp

CCTGGGAAAGTCITGAAAATTATGATAGAATGGTGGAAGGAAAAATTCAGGAGAGTAGTAGTGACTCAAAATGTT 
GAAAGTCnTCTCGTATCCATTGTAATCAGTGCATACAATGAAGAAAAATATCTGCCTGGTCTAATTGAAGACTTAA 
AAAATCAAACCTATCCTAAAGAGGATATTGAAATTCTATTTATAAATGCTATGTCCACAGATGGGACCACAGCT 

TCATTCAGCAATTTATAAAGGAAGATACAGAGTTTAACTCAATTAGATTGTATAACAATCCTAAGAAAAATCAAG
CTAGT GGTTTT AACCTGGGAGTTAAACATTCTGTAGGGGACCTTATTTTAAAAATTGATGCTCATTCAAAAGTTA^ 
TGAGACTTTTGTAATGAACAATGTGGCTATTATTCAACAAGGTGAATTTGTCTGTGGGGGGCCrAGACCGACGATT 
GTCGAAGGAAAAGGAAAATGGGCAGAGACCTTGCATCITGTTGAGGAAAATATGTTTGGCAGTAGCATTGCCAAT 
TATCGAAATAGTTCTGAGGATAGATATGTTTCnTCTATTTTTCATGGAATGTATAAACGAGAGGTTTTCCAGAAGG 

TTGGTTTAGTAAATGAGCAACTTGGCCGAACTGAAGATAATGATATTCATTATAGAATTCGAGAATATGGTTATAA
AATCCGCTATAGCCCAAGTATTCTATCTTATCAGTATATTCGACCAACATTCAAGAAAATGCTGCATCAAAAGTAT 
TCAA ATGGT TTGTGGATTGGCHTGACAAGTCATGTTCAGTTTAAGTGTTTATCATTATTTCACT 
A TTTG TTT TGAGT CTTGTGTTTAG TCTA GCATTGTTA 

A TTTTCTACr riUGTCA TTACT CACTTTGCTGACITTATTAAAACATAAAAATGGA 
ATTTTATTTTCCATTCACTTTGCTTATGGCCTTGGGACGATTGTAGGTTTAATTAGAGGATTTAAATG
AGTACAAGAGAACAATAATTTATTTGGATAAAATAAGCCAAATAAATCAAAATATGCTATAA 

PGKVLKIMIEWWKEKFRRVVVTQNVESLLVSIVISAYNEEKYLPGLIEDLKNQTYPKEDIEILFINAMSTDGTTAIIQQFIK 
EDTEFNSIRLYNNPKKNQASGFNLGVKHSVGDLILKIDAHSKVTETFVMNNVAIIQQGEFVCGGPRPTIVEGKGKWAET
LHLVEENMFGSSIANYRNSSEDRYVSSIFHGMYKREVFQKVGLVNEQLGRTEDNDIHYRIREYGYKIRYSPSILSYQYIRP
TFKKMLHQKYSNGLWIGLTSHVQFKCLSLFHYVPCLFVLSLVFSLALLPITFVFITLLLGAYFLLLSLLTLLTLLKHKNGF
LIVMPFILFSIHFAYGLGTIVGLIRGFKWKKEYKRTIIYLDKISQINQNMLZ 

ID11 696 bp

A TGAT GAAAGAACAAAATACGATAGAAATCGATGTATTTCAATTAGTTAAAAGCTTGTGGAAACGCAAGCTAATG 
ATTTTAATAGTGGCACTTGTGACAGGTGCGGGGGCTTTTGCATAT^ 

GTACCACGCGAATTTACGTAGTGAATCGCAATCAAGGAGACAAGCCGGGGTTGACAAATCAGGATTTGCAGGCAG 
GAACirATCTGGTAAAAGACTACCGTGAGATTATCCTTTCGCAGGATGTTTTGGAGGAAGTTGTTTCTG 
ACTAGATTTGACGCCAAAAGGTTTGGCTAATAAAATTAAAGTGACAGTACCAGTTGATACCCGTATTGTCTCTATT
TCAGTTAATGATCGAGTTCCrGAAGAGGCAAGCCGTATCGCTAACTCTTTGAGAGAAGTAGCTGCTCAAAAAATT 
ATCAGTATTACTCGTGTTTCTGA CGTGAC AACACTGGAGGAGGCAAGGCCGGCGATATCCCCGTCTTCGCCAAAT 
ATTAAACGCAATACACTAATTGGTTTTTTGGCAGGGGTGATTGGAACTAGTGTTATAGTTCTTCATCTTGAAC^
GGATACTCGTGTGAAACGTCCGGAAGATATCGAAAATACATTGCAGATGACACTTTTGGGAGTTGTGCCAAACTT
GGGTAAGTTGAAATAG

MMKEQNTIEIDVFQLVKSLWKRKLMILIVALVTGAGAFAYSTFIVKPEYTSTTRIYVVNRNQGDKPGLTNQDLQAGTYL
VKDYREIILSQDVLEEVVSDLKLDLTPKGLANKIKVTVPVDTRIVSISVNDRVPEEASRIANSLREVAAQKIISITRVSDVT
TLEEARPAISPSSPNIKRNTLIGFLAGVIGTSVIVLHLELLDTRVKRPEDIENTLQMTLLGVVPNLGKLKZ

ID19 555 bp

ATGGTAAAAGTAGCAGTTATATTAGCTCAGGGCTTTGAAGAAATTGAAGCCTTGACAGTTGTAGATGTCTTGCGTC
GAGCCAATATCACATGTGATATGGTTGGTTTTGAAGAGCAAGTAACGGGTTCGCATGCAATCCAAGTAAGAGCAG 
ATCATGTCTTTGATGGAGATTTATCAGACTATGATATGATTGTTCTTCCTGGAGGTATGCCTGGTTCT 
CGTGATAATCAGACCTTGATTCAAGAATTGCAAAGCTTCGAGCAAGAAGGGA^ 

GCACCAATTGCCCTCAATCAAGCAGAGATATTGAAAAATAAGCGATACACTTGTTATGACGGCGTTCAAGAGCAA 
ATCCTTGATGGTCACTACGTCAAGGAAACAGTAGTGGTAGATGGTCAGTTGACAACCAGTCGGGGTCCTTCAACA
GCCCTTGCCTTTGCCTACGAGTTGGTGGAGCAACTAGGAGGGGACGCAGAGAGTTTACGAACAGGAATGCTCTAT 
CGAGATGTCTTTGGTAAAAATCAGTAA 

MVKVAVILAQGFEEIEALTVVDVLRRANITCDMVGFEEQVTGSHAIQVRADHVFDGDLSDYDMIVLPGGMPGSAHLR 
DNQTLIQELQSFEQEGKKLAAICAAPIALNQAEILKNKRYTCYDGVQEQILDGHYVKETVVVDGQLTTSRGPSTALAFA
YELVEQLGGDAESLRTGMLYRDVFGKNQZ 
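
Each entry in Table 1 pairs a nucleotide sequence with its predicted translation, with the stop codon written as Z. As a hedged illustration of that relationship, the sketch below translates the first eleven codons and the last four codons of ID19 using the standard genetic code. The function name `translate` is hypothetical, and alternative start codons (such as the GTG and TTG starts of other entries, which the tables show as M) are deliberately not handled.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built by pairing codons in TCAG order with AMINO_ACIDS.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna, stop_symbol="Z"):
    """Translate a DNA string codon by codon; stop codons become stop_symbol."""
    dna = dna.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


id19_start = "ATGGTAAAAGTAGCAGTTATATTAGCTCAGGGC"  # first 33 nt of ID19
id19_end = "AAAAATCAGTAA"  # last 12 nt of ID19

print(translate(id19_start))
print(translate(id19_end))
```

The same codon-by-codon comparison is a practical way to spot extraction errors in the printed sequences, since a damaged base usually shows up as a mismatch against the listed translation.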

ID27 306 bp 

GTGGTAGGGATGGTAGAACCAAACCTAGAAAGCCTTATAAAAGATCTTTACAATCATGCTCGACATGATTTGAGT
GAAGATTTAGTTGCTGCTCTCCTAGAGACTACTAAAAAACTGCCTACTACAAATGAGCAATTGCAGGCAGT^ 
TCTCAGGCCTGGTCAATCGTGAATTGCtCCTAAATCCCAAACATCCAGCACCTGAGTTGCTCAACITGG 
TGTCAAAAGAGAAGAAGCCAAGTACAGAGGAACTGCGACTTCTGCGCTTATGTATGAGGAACTCnTTAAAATGCT 
TTGA 






MVGMVEPNLESLIKDLYNHARHDLSEDLVAALLETTKKLPTTNEQLQAVRI^GLVNRELLLNPKHPAPELLNLARFVK 
REEAKYRGTATSALMYEELFKMLZ 

ID29 945 bp 



TTGTTCTTAAAAAAGGAAAGAGAGGTAATCAGCATGCGTAAATGGACAAAAGGATTTCTCATCTTTGGTGTGGTG 
ACTACCGTTATCGGCTTTATCCTGCTTTTTGT^ 

AGAACCTGTCTATGATAGCCGTACGGAAAAGCTAACCTTTGGCAAGGAAGTCGAAAACCTAGAAATTACTCTCCA 
CCAACACACGCTCACCATCACAGACTCTTTCGATGATCAAATCCACATTTCTTACCATCCATCTCTTTCTG

CATGATCTTATCACCAATCAGAACGATAGAACTCTGAGTCTCACTGATAAGAAACTGTCTGAAACTCCGTTTCT
CTTCTGGAATTGGTGGGATTCTTCATATCGCAAGTAGCTACTCTAGTCGTTTTGAAGAAGTTATTCT 
AAAAGGGAGAACTCTAAAAGGGATCAACATCTCAGCCAATCGCGGACAAACCACCATCATAAATGCTAGCCTTGA 
AAATGCGACCCTCAATACAAACAGCTATATCCTCCGAATTGAAGGAAGTCGTATCAAAAACAGTAAACTCACAAC 
GCCCAATATCGTTAATATCTTTGATACAGTTCTTACAGATAGTCAGCrAGAGTCAACAGAGAATCACTTCCACGCT 

GAAAATATCCAAGTCCATGGCAAGGTTGAACTGACTGCCAAAGATTATCTCAGAATCATCCTAGACCAGAAAGAA
AGCCAACGAATTAACTGGGACATCTCAAGCAACTATGGTTCTATCrrCCAATTCACAAGAGAAAAGCCTGAATCA 
AGAGGTACGGAATTAAGCAACCCITACAAAACTGAAAAAACCGATGTCAAGGATCAACTCATTGCGAGATCTGAT 
GATAATATTGATCTAATATCCACACCAAGCAGACGTTGA 

MFLKKEREVISMRKWTKGFLIFGVVTTVIGFILLFVGIQSDGIKSLLSMSKEPVYDSRTEKLTFGKEVENLEITLHQHTLTI
TDSFDDQIHISYHPSLSAHHDLITNQNDRTLSLTDKKLSETPFLSSGIGGILHIASSYSSRFEEVILRLPKGRTLKGINISANR
GQTTIINASLENATLNTNSYILRIEGSRIKNSKLTTPNIVNIFDTVLTDSQLESTENHFHAENIQVHGKVELTAKDYLRIILD
QKESQRINWDISSNYGSIFQFTREKPESRGTELSNPYKTEKTDVKDQLIARSDDNIDLISTPSRRZ

ID30 879 bp

atgaaacaagaatggtttgaaagtaatgattttgtaaaaacaacaagcaagaacaagcctgaagagcaagctca 
agaggttgcagacaaggctgaagaaacgatagccgatctcgatacaccaattgaaaaaaatactcagttagagg 
aggaagtccctcaagctgaagtcgaattggaaagccagcaagaagagaaaattgaagctcctgaagacagtgaa 
gcgagaacagaaatagaagaaaagaaggcatctaattctactgaagaagagccagaccritctaaagaaacaga
aaaagtca ctat agctgaagagagccaagaagctcttcctcagcaaaaagcaaccacgaaagagccacitcttat 
cag taaat ctttagaaagtccttatatccccgaccaagctccaaaatctagggataaatggaaagagcaagtgct 
tgaj l i 1 1gg tcttggctagtgg aagcg atcaaatctcctacaagtaagttggaaacaagtatcacacacagttac 

AC AG CCTTTCTCTTG CTC ATTCTGTTTTCTG C ATCTTCC T Till CI T I'AGTATCT ATCACATCA AACATGCTTACTAT 
GGACATATAGCAAGCATTAACAGTCGCTTCCCTGAGCAGCTAGCTCCTTTAACTCTTTTTTCTATCATCTCTATCCT
AGTAGCGACAACACTCTTCTTCTTTTCATTCCTCTTGGGTAGTTTCGTTGTGAGACGATTTATCCACCAGGAAAAG
GACTGGACGCTAGACAAGGTTCTCCAACAATATAGTCAACTCTTGGCAATTCCAATCTCCTCACTGCTATTGCTAG 
rriCriUG C MU'l ClU ' l GATAGCCTACGATTTACAGCCCTCTTGTGTGTGA 

MKQEWFESNDFVKTTSKNKPEEQAQEVADKAEETIADLDTPIEKNTQLEEEVPQAEVELESQQEEKIEAPEDSEARTEIE
EKKASNSTEEEPDLSKETEKVTIAEESQEALPQQKATTKEPLLISKSLESPYIPDQAPKSRDKWKEQVLDFWSWLVEAIKS
PTSKLETSITHSYTAFLLLILFSASSFFFSIYHIKHAYYGHIASINSRFPEQLAPLTLFSIISILVATTLFFFSFLLGSFVVRRFIH
QEKDWTLDKVLQQYSQLLAIPISSLLLLVSLLSLIAYDLQPSCVZ 

ID105 990 bp

ATGCAACTCGCTTCTTCGGTCTACTCATTGTTCGTCTGGTACAA^ 
GCATGCGTAAATGGACAAAAGGATTTCTCATCTTTGGTGTGGTGAC™ 

GGTAT CCAA TCTGACGGGATTAAGAGCCTACTTTCCATGTCCAAAGAACCTGTCTATGATAGCCGTACGGAAAAG 

CTAACCTITGGCAAGGAAGTCGAAAACCTAGAAATTACTCT

GATGATCAAATCCACATTTCTTACCATCCATCTCrTTCTGCTCACCATGATCTTATCACCAATCAGAA 
CTCrGAGTCTCACTGATAAGAAACTGTCTGAAACTCCGTTTCTCTCTTCTGGAATTGGT^ 
AAGTAGCTACTCTAGTCGTTTTGAAGAAGTTATTCTCCGACTACCAAAAGGGAGAACTCTAAAAGGG 
CTCAGCCAATCGCGGACAAACCACCATCATAAATGCTAGCCTTGAAAATGCGACCCTCAATACAAACAGCTATAT 

CCTCCGAATTGAAGGAAGTCGTATCAAAAACAGTAAACTCACAACGCGCAATATCGTTAATATCTITGATACAGTT
CTTACAGATAGTCAGCTAGAGTCAACAGAGAATCACTTCCACGCTGAAAATATCCAAGTCCATGGCAAGGTTGAA 
CTGACTGCCAAAGATTATCTCAGAATCATCCTAGACCAGAAAGAAAGCCAACGAATTAACTGGGACATCTCAAGC 
AACTATGGTTCTATCTTCCAATTCACAAGAGAAAAGCCTGAATCAAGAGGTACGGAATTAAGCAACCCTTACAAA 
ACTGAAAAAACCGATGTCAAGGATCAACTCATTGCGAGATCTGATGATAATATTGATCTAATATCCACACCAAGC 

AGACGTTGA

MQLASSVYSLFVWYNLFLKKEREVISMRKWTKGFLIFGVVTTVIGFILLFVGIQSDGIKSLLSMSKEPVYDSRTEKLTFG
KEVENLEITLHQHTLTITDSFDDQIHISYHPSLSAHHDLITNQNDRTLSLTDKKLSETPFLSSGIGGILHIASSYSSRFEEVIL
RLPKGRTLKGINISANRGQTTIINASLENATLNTNSYILRIEGSRIKNSKLTTPNIVNIFDTVLTDSQLESTENHFHAENIQV
HGKVELTAKDYLRIILDQKESQRINWDISSNYGSIFQFTREKPESRGTELSNPYKTEKTDVKDQLIARSDDNIDLISTPSRR
Z






ID107 78 bp

ATGATATGTAAAATGAAGCAGGGAGGGAGCAGGGCGTGCTGGGGATGGAGAGTGGGGGAGGGACGCTGCTATTT 
TAATC 

MICKMKQGGSRACWGWRVGEGRCYFN 
ID109 714 bp 



CGATAAAGAGGCCTTGAGTAATCTCAATTTGCAGATTGAAAATGGAGAGATTATGGGCTTGATTGGTCATAATGG 
GGCTGGAAAATCGACCACTATAAAATCCCTAGTCAGTATCATTTCACCCAGCAGTGGTCGTATTTTGGTAGACGGT

CAGGAGTTATCGGAAAATCGCTTGGCTATTAAACGAAAGATTGGCTACGTAGCAGACTCGCCTGACTTATTTTTAC 

GCTTAACGGCCAATGAATTTTGGGAATTGATCGCCTCATCCTATGATCTGAGTAGATCTGACTTGGAGGCTAGTCT 

AGCTAGGCTATTGAACGTTTTTGATTTTGCTGAAAATCGCT 

CAGAAAGTCTTTGTCATCGGAGCACTCTTGTCTGATCCCGATATTTGGG 
ATCCCCAGGCTGCCTTTGATTTGAAACAGATGATGAAGGAACATGCACAAAAAGGGAAGACAGTCTTGTTTTCAA

CTCATGTCCTAGAGGTGGCAGAGCAAGTCTGTGATCGGATTGCCATTTTGAAAAAGGGGCATTTGATTTATTGTGG 

TAAGGTAGAGGACTTGAGGAAAGACCACCCAGACCAGTCITTGGAAAGTATCTACCTTAGTCTTGCTGGTAGAAA 

AGAGGAGGTTGCGGATGCGTCTCAAGGTCATTAA 

DKEALSNLNLQIENGEIMGLIGHNGAGKSTTIKSLVSIISPSSGRILVDGQELSENRLAIKRKIGYVADSPDLFLRLTANEF
WELIASSYDLSRSDLEASLARLLNVFDFAENRYQVIETLSHGMRQKVFVIGALLSDPDIWVLDEPLTGLDPQAAFDLKQ
MMKEHAQKGKTVLFSTHVLEVAEQVCDRIAILKKGHLIYCGKVEDLRKDHPDQSLESIYLSLAGRKEEVADASQGHZ



ID112 360 bp



ATGGCTTTGTTTTCAGAGAGAGGAGCAGTACGGAAGACACCAATGGCAAGTCCAATAATGAGACCTATGATGGTT 
CCGACGATAGAGATTAAAAGAGTGATACCAGCACCACGCAAGAGTTGTTGCCAGTTTTCAGAAAGAATTTTAGCA 
ACTTGGCTAAAGAAACTACTGCTAGTCrCTTCAGTTGTTGTAGCT^ 
TCAAGGCAACTTGGTCATCTTTTGAAATGGTTTCAATGCTGGCATTGATTTC 
AGCCCGATAGCGATAGCTGTATCTTCTTCCCCAGTTTTGAAACCAGGTTCTACTTGA
MALFSERGAVRKTPMASPIMRPMMVPTIEIKRVIPAPRKSCCQFSERILATWLKKLLLVSSVVVASAGCSLIIRSIKATWS^
FEMVSMLALIWLIRLSFLRSPIAIAVSSSPVLKPGSTZ 

ID 128 - 3.43 

ATGAAATTTAGTAAAAAATATATAGCAGCTGGATCAGCTGTTATCGTATC 

CTTGAGTCTATGTGCCTATGCACTAAACCAGCATCGTTCGCAGGAAAATA 

AGGACAATAATCGTGTCTCTTATGTGGATGGCAGCCAGTCAAGTCAGAAA 

AGTGAAAACTTGACACCAGACCAGGTTAGCCAGAAAGAAGGAATTCAGGC 

TGAGCAAATTGTAATCAAAATTACAGATCAGGGCTATGTAACGTCACACG 

GTGACCACTATCATTACTATAATGGGAAAGTTCCTTATGATGCCCTCTTT 

AGTGAAGAACTCTTGATGAAGGATCCAAACTATCAACTTAAAGACGCTGA 

TATTGTCAATGAAGTCAAGGGTGGTTATATCATCAAGGTCGATGGAAAAT 

ATTATGTCTACCTGAAAGATGCAGCTCATGCTGATAATGTTCGAACTAAA 

GATGAAATCAATCGTCAAAAACAAGAACATGTCAAAGATAATGAGAAGGT 

TAACTCTAATGTTGCTGTAGCAAGGTCTCAGGGACGATATACGACAAATG 

ATGGTTATGTCTTTAATCCAGCTGATATTATCGAAGATACGGGTAATGCT 

TATATCGTTCCTCATGGAGGTCACTATCACTACATTCCCAAAAGCGATTT 

ATCTGCTAGTGAATTAGCAGCAGCTAAAGCACATCTGGCTGGAAAAAATA 

TG CAA CGG A G TC AG TT A A G CT A TT CTT C A A C A G CT A G T G A C A~A T A A C A CO 

CAATCTGTAGCAAAAGGATCAACTAGCAAGCCAGCAAATAAATCTGAAAA 

TCTCCAGAGTCTTTTGAAGGAACTCTATGATTCACCTAGCGCCCAACGTT 

ACAGTGAATCAGATGGCCTGGTCTTTGACCCTGCTAAGATTATCAGTCGT 

ACACCAAATGGAGTTGCGATTCCGCATGGCGACCATTACCACTTTATTCC 

TTACAGCAAGCTTTCTGCCTTAGAAGAAAAGATTGCCAGAATGGTGCCTA 

TCAGTGGAACTGGTTCTACAGTTTCTACAAATGCAAAACCTAATGAAGTA 

GTGTCTAGTCTAGGCAGTCTTTCAAGCAATCCTTCTTCTTTAACGACAAG 

TAAGGAGCTCTCTTCAGCATCTGATGGTTATATTTTTAATCCAAAAGATA 

TCGTTGAAGAAACGGCTACAGCTTATATTGTAAGACATGGTGATCATTTC 

CATTACATTCCAAAATCAAATCAAATTGGGCAACCGACTCTTCCAAACAA 

TAGTCTAGCAACACCTTCTCCATCTCTTCCAATCAATCCAGGAACTTCAC 

ATGAGAAACATGAAGAAGATGGATACGGATTTGATGCTAATCGTATTATC

GCTGAAGATGAATCAGGTTTTGTCATGAGTCACGGAGACCACAATCATTA

TTTCTTCAAGAAGGACTTGACAGAAGAGCAAATTAAGGTGCGCAAAAACA 

TTTAG 

MKFSKKYIAAGSAVIVSLSLCAYALNQHRSQENKDNNRVSYVDGSQSSQK

SENLTPDQVSQKEGIQAEQIVIKITDQGYVTSHGDHYHYYNGKVPYDALF 

SEELLMKDPNYQLKDADIVNEVKGGYIIKVDGKYYVYLKDAAHADNVRTK 

DEINRQKQEHVKDNEKVNSNVAVARSQGRYTTNDGYVFNPADIIEDTGNA 

YIVPHGGHYHYIPKSDLSASELAAAKAHLAGKNMQPSQLSYSSTASDNNT 

QSVAKGSTSKPANKSENLQSLLKELYDSPSAQRYSESDGLVFDPAKIISR 

TPNGVAIPHGDHYHFIPYSKLSALEEKIARMVPISGTGSTVSTNAKPNEV 

VSSLGSLSSNPSSLTTSKELSSASDGYIFNPKDIVEETATAYIVRHGDHF 

HYIPKSNQIGQPTLPNNSLATPSPSLPINPGTSHEKHEEDGYGFDANRII

AEDESGFVMSHGDHNHYFFKKDLTEEQIKVRKNI*

TABLE 2

ID2 840 bp

ATGGGAATTGCTCTAGAAAATGTGAATTTTACATATCAAGAA

TTTCTTTGACGATTGAAGATGGCTCTTATACAGCTTTAATTGGGCACACAGGTAGTGGTAAATCAACTATTTTACA
ACTCTTAAATGGTTTATTGGTGCCAAGTCAAGGGAGTGTGAGGGTTTTTGATACCTTAATCACCTCGACTTCT
AATAAAGATATTCGTCAAATTAGAAAACAGGTTGGCTTGGTATTTCAGTTTGCTGAAAATCAGATTTTTGAAGAAA
CGGTTTTGAAGGACGTTGCTTTTGGACCGCAAAATTTTGGAGTTTCrGAAG^ 

GAAACTGGCTCTGGTTGGAATTGATGAATCACTTTTTGATCGTAGTCCGTTTGAGCT

CGTGTTGCCATTGCAGGCATACTTGCCATGGAGCCAGCTATATTAGTCTTAGATGAGCCAACAGCTGGTCTAGATC 
CTCTAGGGAGAAAAGAGTTGATGACCCTGTTCAAAAAACTCCACCAGTCAGGGATGACCATCGTCTTGGTAACGC 
ATTTGATGGATGA TGTT GCTGAATATGCGAATCAAGTCTATGTAATGGAAAAGGGACGTTTAGTAAAGGGGGGCA 
AACCAAGTGATGTCTTTCAAGACGTTGTTTTTATGGAAGAAGTTCAG 

TAAACGATTGGCTGATAGAGGCGTGTCATTTAAACGATTACCGATTAAGATAGAGGAGTTCAAGGAGTCGCTAAA
TGGATAG 



MGIALENVNFTYQEGTPLASAALSDVSLTIEDGSYTALIGHTGSGKSTILQLLNGLLVPSQGSVRVFDTLITSTSKNKDIR 
QIRKQVGLVFQFAENQIFEETVLKDVAFGPQNFGVSEEDAVKTAREKLALVGIDESLFDRSPFELSGGQMRRVAIAGILA 
MEPAILVLDEPTAGLDPLGRKELMTLFKKLHQSGMTIVLVTHLMDDVAEYANQVYVMEKGRLVKGGKPSDVFQDVV
FMEEVQLGVPKITAFCKRLADRGVSFKRLPIKIEEFKESLNGZ

ID3 6360 bp

TACCCGGTAGTCTTAGCAGACACATCTAGCTCTGAAGATGCTTTAAACATCTCTGATAAAGAAAAAGTAGCAGAA
AATAAAGAGAAACATGAAAATATCCATAGTGCTATGGAAACTTCACAGGATTTTAAAGAGAAGAAAACAGCAGTC 
ATTAAGGAAAAAGAAGTTGTTAGTAAAAATCCTGTGATAGACAATAACACTAGCAATGAAGAAGCAAAAATCAA 
AGAAGAAAATTCCAATAAATCCCAAGGAGATTATACGGACTCATTTGTGAATAAAAACACAGAAAATCCCAAAAA 
AGAAGATAAAGTTGTCT ATAT TGCTGAATTTAAAGATAAAGAATCTGGAGAAAAAGCAATCAAGGAACTATCCAG 
TCTTAAGAATACAAAAGTTTTATATACTTATGATAGAATTTTTAACGGTAGTGCCATAGAAACAACT

TTGGACAAAATTAAACAAATAGAAGGTATTTCATCGGTTGAAAGGGCACAAAAAGTCCAACCCATGATGAATCAT 
GCCAGAAAGGAAATTGGAGTTGAGGAAGCTATTGATTACCrAAAGTCTATCAATGCrCCGTTTGGGAAAAATTTT 
GATGGTAGAGGTATGGTCATTTCAAATATCGATACTGGAACAGATTATAGACATAAGGCTATGAGAATCGATGAT 
GATGCCAAAGCCTCAATGAGATTTAAAAAAGAAGACTTAAAAGGCACTGATAAAAATTATTGGTTGAGTGATAAA 
ATCCCTCATGCGTTCAATTATTATAATGGTGGCAAAATCACTGTAGAAAAATATGATGATGGAAGGGATTATTTTG
ACCCACATGGGATGCATATTGCAGGGATTCTTGCTGGAAATGATACTGAACAAGACATCAAAAACTTTAACGGCA
TAGATGGAATTGCACCTAATGCACAAATTTTCTCTTACAAAATGTATTCTGACGCAGGATCTGGGTTTGCGGGTGA 
TGAAACAATGTTTCATGCTATTGAAGATTCTATCAAACACAACGTTGATGTTGTTTCGGTATCATCTGGTTTTACA 
GGAACAGGTCnTGTAGGTGAGAAATATTGGCAAGCTATTCGGGCATTAAGAAAAGCAGGCATTCCAATGGTTGTC 
GCTACGGGTAACTATGCGACTTCTGCTTCAAGTTCTTCATGGGATTTAGTAGCAAATAATCATCTGAAAATGACCG
ACACTGGAAATGTAACACGAACTGCAGCAC ATGAA GATGCGATAGCGGTCGCTTCTGCTAAAAATCAAACAGTTG 
AGTTTG ATAAAGTTAACATAGGTGGAGAAAGTTTTAAATACAGAAATATAGGGGCCTTTTTCGAT AAG AGTAAAA 
TCACAACAAATGAAGATGGAACAAAAGCTCCTAGTAAATTAAAATTTGTATATATAGGCAAGGGGCAAGACCAAG 
ATTTGATAGGTTTGGATCTTAGGGGCAAAATTGCAGTAATGGATAGAATTTATACAAAGGATTTAAAAAATGCTTT 
TAAAAAAGCTATGGATAAGGGTGCACGCGCCATTATGGTTGTAAATACTGTAAATTACTACAATAGAGATAATTG
GACAGAGCTTCCAGCTATGGGATATGAAGCGGATGAAGGTACTAAAAGTCAAGTGTTTTCAATTTCAGGAGATGA 
TGGTGTAAAGCTATGGAACATGATTAATCCTGATAAAAAAACTGAAGTCAAAAGAAATAATAAAGAAGATTTTAA 
AGATAAATT GGAG CAATACTATCCAATTGATATGGAAAGTTTTAATTCCAACAAACCGAATGTAGGTGACGAAAA 
AGAGATTGACTTTAAGTTTGCACCTGACACAGACAAAGAACTCTATAAAGAAGATATCATCGTTCCAGCAGGATC 
TACATCTTGGGGGCCAAGAATAGATTTACTTTTAAAACCCGATGTTTCAGCACCTGGTAAAAATATTAAATCCACG
CTTAAT GTTA TTAATGGCAAATCAACITATGGCTATATGTCAGGAACTAGTATGGCGACTCCAATCGTGGCAGCTT 
CTACrGTTTrGATTAGACCGAAATTAAAGGAAATGCTTGAAAGACCTGTATTGAAAAATCTTAAGGGAGATGACA 
AAATAGATCTTACAAGTCTTACAAAAATTGCCCTACAAAATACTGCGCGACCrATGATGGATGCAA 
AAGAAAAAAGTCAA TACIT TGCATCACCTAGACAACAGGGAGCAGGCCTAATTAATGTGGCCAATGCTTTGAGAA 
ATGAAGTTGTAGCAACTTTCA AAAA CACTGATTCTAAAGGTTTGGTAAACTCATATGGTTCCATTTCT 
AATAAAAGGTGATAAAAAATACTTTACAATCAAGCTTCACAATACATCAAACAGACCITTGACTTTTAAAGTTTCA 
GCATCAGCGATAACTACAGATTCTCTAACTGACAGATTAAAACTTGATGAAACATATAAAGATGAAAAATCTCCA 
GATGGTAAGCAAATTGTTCCAGAAATTCACCCAGAAAAAGTCAAAGGAGCAAATATCACATTTGAGCATGATACT 
TTCACTATAGGCGCAAATr CTAGC TTTGATTTGAATGCGGTTATAAATGTTGGAGAGGCCAAAAACAAAAATAAA 
TTTGTAGAATCATTTATTCATTTTGAGTCAGTGGAAGCGATGGAAGCTCTAAACTCCAGCGGGAAGAAAATAAAC
TTCCAACCTTCTTTGTCGATGCCTCTAATGGGATTTGCTGGGAATTGGAACCACGAACCAATCCTTGATAAATGGG 
CTTGGGAAGAAGGGTCAAGATCAAAAACACTGGGAGGTTATGATGATGATGGTAAACCGAAAATTCCAGGAACCT 
TAAATAAGGGAATTGGTGGAGAACATGGTATAGATAAATTTAATCCAGCAGGAGTTATACAAAATAGAAAAGATA 
AAAATACAACATCCCTGGATCAAAATCCAGAATTATTTGCTTTCAATAACGAAGGGATCAACGCTCCATCATCAA 
GTGGTTCTAAGATTGCTAACATTTATCCTTTAGATTCAAATGGAAATCCTCAAGATGCTCAACTTGAAAGAGGATT
AACACCTTCTCCACTTGTATTAAGAAGTGCAGAAGAAGGATTGATTTCAATAGTAAATACAAATAAAGAGGGAGA
AAATCAAAGAGACTTAAAAGTCATTrCGAGAGAACACTTTATTAGA 

AAAGGGAATCAAATCATCTAAACTAAAAGTTTGGGGTGACTTGAAGTGGGATGGACTCATCTATAATCCTAGAGG 
TAGAGAAGAAAATGCACCAGAAAGTAAGGATAATCAAGATCCTGCTACTAAGATAAGAGGTCAATTTGAACCGAT 
TGCGGAAGGTCAATATTTCTATAAATTTAAATATAGATTAACTAAAGATTACCCATGGCAGGTTTCCTATATTCCT
GTAAAAATTGATAACACCGCCCCTAAGATTGTTTCGGTTGATTTTTCAAATCCTGAAAAAATTAAGTTGA 
AGGATACTTATCATAAGGTAAAAGATCAGTATAAGAATGAAACGCTATTTGCGAGAGATCAAAAAGAACATCCTG 
AAAAATTTGACGAGATTGCGAACGAAGTTTGGTATGCTGGCGCCGCTCTTGTTAATGAAGATGGAGAGGTTGAAA 
AAAATCTTGAAGTAACITACGCAGGTGAGGGTCAAGGAAGAAATAGAAAACTTGATAAAGACGGAAATACCATTT 

ATGAAATTAAAGGTGCGGGAGATTTAAGGGGAAAAATCATTGAAGTCATTGCATTAGATGGTTCTAGCAATTTCA
CAAAGATTCATAGAATTAAATTTGCTAATCAGGCTGATGAAAAGGGGATGATTTCCTATTATCTAGTAGATCCTGA 
TCAAGATTCATCTAAATATCAAAAGCTTGGCGAGATTGCAGAATCTAAATTTAAAAATTTAGGAAATGGAAAAGA 
GGG TAGTC TAAAAAAAGATACAACTGGGGTAGAACATCATCATCAAGAAAATGAAGAGTCTATTAAAGAAAAAT 
CTAGTTTTACTATTGATAGAAATATTTCAACAATTAG 

GAAATTTAGAGAAGTTGATGATTTTACAAGTGAAACTGGTAAGAGAATGGAGGAATACGATTATAAATACGATGA
TAAAGGAAATATAATAGCCTACGATGATGGGACTGATCTAGAATATGAAACTGAGAAACTTGACGAAATCAAATC 
AAAAATTTATGGTGTTCTAAGTCCGTCTAAAGATGGACACTTTGAAATTCTTGGAAAGATAAG 
AATGCCAAGGTATATTATGGGAATAACTATAAATCTATAGAAATCAAAGCGACCAAGTATGATTTCCACTCAAAA
ACGATGACATTTGATCTATACGCTAATATTAATGATATTGTGGATGGATTAGCITTTGCAGGAGATATGAGATTAT 

TTGTTAAAGATAATGATCAGAAAAAAGCTGAAATTAAAATTAGAATGCCTGAAAAAATTAAGGAAACTAAATCAG
AATATCCCTATGT^CAAGTTAT^GC^ 

AIWAACTAAAATGGAATCTGGTAAAATCTATTCTGATTCAGAAAAACAACAATATCT^ 

TCTAAGAAAAGGCTATGCACTAAAAGTGACTACCTATAATCCTGGAAAAACGGATATGTTAGAAGGAAATGGAGT 
CTATAGCAAGGAAGATATAGCAAAAATACAAAAGGCCAATCCTAATCTAAGAGCCCTTTCAGAAACAACAATTTA 
TGCTGATAGTAGAAATGTTGAAGATGGAAGAAGTACCCAATCTGTATTAATGTCGGCTTTGGACGGCTTTAACATT
ATAAGGTATCAAGTGTTTACATTTAAAATGAACGATAAAGGGGAAGCTATCGATAAAGACGGAAATCTTGTGACA 
GATTCTTCTAAj\CTTGTATTATTTGGTAAGGATGATAAAG 

TAAAAGAj\GATGGCTCCATGTTATTTATTGATACCAAACCAGTAAACCTTTCAATGGATAAGAACTACTTTAATCC 

atctaaatctaataaaatttatctacgaaatccagaattttatttaagaggtaagatttct 
aactgggaattgagagttaatgaatcggttgtagataattatttaatctacggagatttacacattgataacacta
gagattttaatattaagctgaatgttaaagacggtgacatcatggactggggaatgaaagactataaagcaaacg 
gatitccagataaggtaac agata tggatggaaatgtttatcttcaaactggctatagcgatttgaatgctaaagc 
agttggagtccactatcagttittatatgataatgttaaacccgaagtaaacattgatccraagggaaatactagt 
atcgaatatgctgatggaaaatctgtagtcittaacatcaatgataaaagaaataatggattcgatggtgagatt. 
caagaacaacatatttatataaatggaaaagaatatacatcatttaatgatattaaacaaataatagacaagaca
ctaaacattaagattgttgtaaaagattttgcaagaaatacaaccgtaaa^ 

ggagaggtaagtg aattaaaacct catagggtaactgtgaccattcaaaatggaaaagaaatgagttcaacgata 
gtgtcggaagaagattttattttacctgtttataagggtgaattagaaaaaggataccaatttgatggttggga^ 
tttctggtttcgaaggtaaaaaagacgctggctatgttattaatctatcaaaagatacct^ 
caagaaaatagaggagaaaaaggaggaagaaaataaacctacttttgatgtatcgaaaaagaaagataacccac
aagtaaaccatagtcaattaaatgaaagtcacagaaaagaggatttacaaagagaagagcattcacaaaaatct 
gattcaactaaggatgttacagctacagttcttgataaaaacaatatcagtagtaaatcaactactaacaatcct 
a ataag ttgccaaaaactggaacagcaagcggagcccagacactattagctgccggaataatgtttatagtagga 
atttttcttggattgaagaaaaaaaatcaagattaa 


YPVVLADTSSSEDALNISDKEKVAENKEKHENIHSAMETSQDFKEKKTAVIKEKEVVSKNPVIDNNTSNEEAKIKEENSN 
KSQGDYTDSFVNKNTENPKKEDKVVYIAEFKDKESGEKAIKELSSLKNTKVLYTYDRIFNGSAIETTPDNLDKIKQIEGIS
SVERAQKVQPMMNHARKEIGVEEAIDYLKSINAPFGKNFDGRGMVISNIDTGTDYRHKAMRIDDDAKASMRFKKEDL
KGTDKNYWLSDKIPHAFNYYNGGKITVEKYDDGRDYFDPHGMHIAGILAGNDTEQDIKNFNGIDGIAPNAQIFSYKMY 

SDAGSGFAGDETMFHAIEDSIKHNVDVVSVSSGFTGTGLVGEKYWQAIRALRKAGIPMVVATGNYATSASSSSWDLVA
NNHLKMTDTGNVTRTAAHEDAIAVASAKNQTVEFDKVNIGGESFKYRNIGAFFDKSKITTNEDGTKAPSKLKFVYIGK 
GQDQDLIGLDLRGKIAVMDRIYTKDLKNAFKKAMDKGARAIMVVNTVNYYNRDNWTELPAMGYEADEGTKSQVFSI 
SGDDGVKLWNMINPDKKTEVKRNNKEDFKDKLEQYYPIDMESFNSNKPNVGDEKEIDFKFAPDTDKELYKEDIIVPAG 
STSWGPRIDLLLKPDVSAPGKNIKSTLNVINGKSTYGYMSGTSMATPIVAASTVLIRPKLKEMLERPVLKNLKGDDKIDL

TSLTKIALQNTARPMMDATSWKEKSQYFASPRQQGAGLINVANALRNEVVATFKNTDSKGLVNSYGSISLKEIKGDKK
YFTIKLHNTSNRPLTFKVSASAITTDSLTDRLKLDETYKDEKSPDGKQIVPEIHPEKVKGANITFEHDTFTIGANSSFDLN 
AVINVGEAKNKNKFVESFIHFESVEAMEALNSSGKKINFQPSLSMPLMGFAGNWNHEPILDKWAWEEGSRSKTLGGYD 
DDGKPKIPGTLNKGIGGEHGIDKFNPAGVIQNRKDKNTTSLDQNPELFAFNNEGINAPSSSGSKIANIYPLDSNGNPQDA 
QLERGLTPSPLVLRSAEEGLISIVNTNKEGENQRDLKVISREHFIRGILNSKSNDAKGIKSSKLKVWGDLKWDGLIYNPRG 

REENAPESKDNQDPATKIRGQFEPIAEGQYFYKFKYRLTKDYPWQVSYIPVKIDNTAPKIVSVDFSNPEKIKUTKDTYHK
VKDQYKNETLFARDQKEHPEKFDEIANEVWYAGAALVNEDGEVEKNLEVTYAGEGQGRNRKLDKDGNTIYEIKGAG 
DLRGKIIEVIALDGSSNFTKIHRIKFANQADEKGMISYYLVDPDQDSSKYQKLGEIAESKFKNLGNGKEGSLKKDTTGVE
HHHQENEESIKEKSSFTIDRNISTIRDFENKDLKKLIKKKFREVDDFTSETGKRMEEYDYKYDDKGNIIAYDDGTDLEYE 
TEKLDEIKSKIYGVLSPSKDGHFEILGKISNVSKNAKVYYGNNYKSIEIKATKYDFHSKTMTFDLYANINDIVDGLAFAG 

DMRLFVKDNDQKKAEIKIRMPEKIKETKSEYPYVSSYGNVIELGEGDLSKNKPDNLTKMESGKIYSDSEKQQYLLKDNII






LRKGYALKVTTYNPGKTDMLEGNGVYSKEDIAKIQKANPNLRALSETTIYADSRNVEDGRSTQSVLMSALDGFNIIRYQ

VFTFKMNDKGEAIDKDGNLVTDSSKLVLFGKDDKEYTGEDKFNVEAIKEDGSMLFIDTKPVNLSMDKNYFNPSKSNKI

YVRNPEFYLRGKISDKGGFNWELRVNESVVDNYLIYGDLHIDNTTRDFNIKLNTVKDGDIMDWGMKDYKANGFPDKVTD

MDGNVYLQTGYSDLNAKAVGVHYQFLYDNVKPEVNIDPKGNTSIEYADGKSVVFNINDKRNNGFDGEIQEQHIYINGK

EYTSFNDIKQIIDKTLNIKIVVKDFARNTTVKEFILNKDTGEVSELKPHRVTVTIQNGKEMSSTIVSEEDFILPVYKGELEK

GYQFDGWEISGFEGKKDAGYVINLSKDTFIKPVFKKIEEKKEEENKPTFDVSKKKDNPQVNHSQLNESHRKEDLQREEH

SQKSDSTKDVTATVLDKNNISSKSTTNNPNKLPKTGTASGAQTLLAAGIMFIVGIFLGLKKKNQDZ

ID6 597 bp 



CTTGAATTAAATAAAAAACGTCATGCGACTAAGCATTTTACTGATAAGCTTGTTGATCCCAAAGATGTGCGTACGG 
CTATCGAAATTGCAACCTTAGCGCCAAGCGCCCACAACAGCCAGCCTTGGAAATTTGTGGTGGTACGTGAGAAAA 
ATGCTGAACTGGCAAAGTTAGCrTATGGTTCCAATTTTGAACAGGTATCATCAGCGCCTGTAACCA^ 
TACAGAT ACGGA CTTAGCCAAACGTGCTCGTAAGATTGCCCGTGTTGGTGGTGCTAATAACTTTTCTGAAGAGCAA 
CTTCAATATTTTATGAAAAATCTGCCAGCTGAGTTTGCCCGTTACAGTGAGCAACAAGTCAGCGACTACCTAGCTC
TCAA TGCAGGTTTGGTTGCCATGAAC TTGG TTCTTGCATTC 

TTTTGACAAATCAAAAGTTAATGAAGTTTTGGAAATCGAAGACCGTTTCCGCCCAGAACTCTTGATCACAGTGGGT 
TATACAGACGAAAAATTGGAACCAAGCTACCGCTTGCCAGTAGATGAAATCATCGAGAAAAGATAG 

LELNKKRHATKHFTDKLVDPKDVRTAIEIATLAPSAHNSQPWKFVVVREKNAELAKLAYGSNFEQVSSAPVTIALFTDT
DLAKRARKIARVGGANNFSEEQLQYFMKNLPAEFARYSEQQVSDYLALNAGLVAMNLVLALTDQGIGSNIILGFDKSK
VNEVLEIEDRFRPELLITVGYTDEKLEPSYRLPVDEIIEKRZ



ID7 1401 bp



ATGACAGCAATTGATTTTACAGCAGAAGTAGAAAAACGCAAAGAAGACCTCT^^ 

GAAATCAATTCAGAACGTGATGACAGCAAGGCTGATGCCCAGCATCCATTTGGGCCTGGTCCAGTAAAAGCCTTG 
GAGAAATTCCTTGAAATCGCAGACCGCGATGGCTACCCAACTAAGAATGTTGATAACTATGCAGGACATTTTGAG 
TTTGGTGATGGAGAAGAAGTTCTCGGAATCTTTGCCCATATGGATGTGGTGCCTGCTGGTAGCGGTTGGGACACAG 
ACCCTTACACACCAACTATCAAAGATGGTCGCCTTTATGCGCGCGGGGCTTCGGACGATAAGGGTCCTACAACAG
CTTGTTACTATGGTTTGAAAATCATCAAAGAATTGGGTCITCCAACrTCT 

AGACGAAGAATCAGGCTGGGCAGACATGGACTACTACTTTGAGCACGTAGGACnTGCCAAACCAGATTTCGGTTT 
CTCACCAGATGCTGAATTTCCAATCAT CAAT GGTGAAAAAGGAAATATCACGGAATACCTCCACTTTGCAGGAGA 
AAATACAGGTGTTGCCCGTCTTCACAGCITTACAGGTGGTTTACGTGAAAATATGGTACCAGAATCAGCAACAGC 

AGTCGTTTCAGGTGACTTGGCTGACITGCAAGCTAAACT^

ACTCCAAGAAGAAGCTGGCAAATACAAGGTGACGATCATTGGTAAATCAGCCCACGGTGCTATGCCTGCTTCAGG 
TGTCAATGGCGCAACTTACCTTGCCCTCTTCCTCAGCCAGTTTGGCnTTGCTGGTCCAGCCAAAGACT 
ATC GCAG GTAAAATTCTCTrGAACGATCATGAGGGTGAAAATCTTAAGATTGCTCATGTGGATGAAAAGATGGGT 
GCTCTTTCTATGAATGCCGGCGTCTTCCACTTCGATGAAACAAGTGCTGATAATACCATTGCCCTCAACATCCGCT 

ATCCAAAAGGAACAAGTCCAGAACAAATCAAGTCAATCCTTGAAA^

ACACGG TCAC ACGCCTCACTATGTGCCAATGGAAGATCCACTTGTGCAAACCTTGTTGAATATCTATGAAAAACA 
AACTGGCTTTAAAGGTCATGAACAAGTCATCGGTGGTGGAACCTTTGGTCGCTTGCTAGAACGCGGAGTTGCCTA 
CGGTGCTATGTTCCCAGACTCGATTGATACCATGCACCAAGCCAATGAATTTATCGCCTTGGATGATCTTTTCCGA 
GCAGCAGCAATTTATGCCGAAGCTATTTACGAATTGATCAAATAA 


MTAIDFTAEVEKRKEDLLADLFSLLEINSERDDSKADAQHPFGPGPVKALEKFLEIADRDGYPTKNVDNYAGHFEFGD 
GEEVLGIFAHMDVVPAGSGWDTDPYTPTIKDGRLYARGASDDKGPTTACYYGLKIIKELGLPTSKKVRFIVGTDEESGW 
ADMDYYFEHVGLAKPDFGFSPDAEFPIINGEKGNITEYLHFAGENTGVARLHSFTGGLRENMVPESATAVVSGDLADL
QAKLDAFVAEHKLRGELQEEAGKYKVTIIGKSAHGAMPASGVNGATYLALFLSQFGFAGPAKDYLDIAGKILLNDHEG
ENLKIAHVDEKMGALSMNAGVFHFDETSADNTIALNIRYPKGTSPEQIKSILENLPVVSVSLSEHGHTPHYVPMEDPLVQ
TLLNIYEKQTGFKGHEQVIGGGTFGRLLERGVAYGAMFPDSIDTMHQANEFIALDDLFRAAAIYAEAIYELIKZ 

ID8 1617 bp 

GTGTATACTATTATAAAATCAAATATAAAAAAATTTAGTTTATTAACG

AATTTATGCAGCAACTATTAATGCTCTGGTGTTGAATGAATTAATTGCGATGAATTTAGAGCGGTTTTTGA 

TCAATCTACCAAATGATTGTCTGGTGTGGGATAATATTCCTTGACTGGGTAGTGAAAAATTATCAGGTTGAAGTGA 

TCCAAGAGTTTAATCTAGAGATTCGAAATAGAGTTGCCACAGACATCTCTAACTCTACCTATCAAGAATTTCATAG 

TAAATCATCAGGAACATATCTTTCGTGGCTAAATAATGATGTTCAGACTTTAAATGATCAGGCGTTTAAACAACTT

TTTTTAGTAATAAAAGGAATTTCTGGTACTATATTTGCAGTTGTGACTCTTAATCACTATCATTGGTCATTGACT

AGCCACCTTGTTTTCATTAATG ATTATGCT ACTTGTACCAAAAATCnTTGCATCGAAAATGCGAGAAGTTAGTCTA 
AATTTAACTAACCAAAATGAAGCI'T ri'ri AAAATCTAGTGAGAC^ATATTGAATGGATTTGATGTGTTAGCGTCCT 
TGAATCTTTTATATGTATTGCCTAAGAAAATTAAAGAAGCAGGAATTTTATTAAAGATGGTTATACAAAGAAAGA 
CAACTGTAGAAACGTTAGCAGGCG.CTATTAGCTTCTTTCTCAATATTTTTTTTCAGATATCTCT 

GGCTATCTTGCAATAAAAGGAATAGTGAAAATTGGTACTATTGAAGCAATAGGAGCACTAACAGGTGTTATTTTT
ACAGCGCTAGGTGAATTAGGAGGTCAATTATCCTCTATTATTGGTACGAAGCCTATTTTTTTAAAATTGTATTCAA

TTAATCCAATTGAGTCAAATAAAATGAATGATATCGAACCAAATGAGGTGAATAGAGATTTTCCGTTATATGAAG 

CAAAAAATATTTGCTATAAGTATGGAGATAAAGAAATATTAAAAAACTTAAATTTTTGTTTTCAACGTAATGAAAA 

GTATTTAATTTTAGGTGAAAGTGGAAGCGGGAAATCTACATTATTAAAATTATTGAATGGCTTTTTGAG 

AGTGGAGAATTGCGATTCTGCGGGGATGATATAAAAAAAACCTCCTATTTAAATATGGTTTCGAATGTTCT 

T AG ATC AAAA AGCTT ATTTGTTTG A AG GT ACG ATT AG AG ATA AT ATTTT ATTGG AAG A AAATT AT A CTG ATG A AG A 

AATACTACAGTCTTTAGAGCAAGTTGGTTTGAGTGTAAAAGATTTTCCTAATAACATTTTAGATTATT 

GATGATGGGAGATTACTGTCAGGAGGGCAGAAACAAAAAATTACTTTAGCTAGAGGGCTAATTAGAAATAAGAA 

AATAGTATTAATTGACGAGGGAACTTCTGCTATCGATAGGAGAACTTCGTTAGCGATTGAACGTAAGATATTAGA 

TAGAGAGGATTTGACTGTCATTATTGTTACCCATGCrCCGCATCCGGAACTTAAACAATATTTTACTA 

CAATTTCCAAAGG ATTTT ATTTAA 

MYTIIKSNKKFSLLTIFIVAGQLLLIYAATINALVLNELIAMNLERFLKI^IYOMIVWCGIIFLDWVVKNYQVEVIQEFNL 
EIRNRVATDISNSTYQEFHSKSSGTYLSWLNNDVQTLNDQAFKQLFLVIKGISGTIFAVVTLNHYHWSLTVATLFSLMIM 
LLVPKIFASKMREVSLNLTNQNEAFLKSSETILNGFDVLASLNLLYVLPKKIKEAGILLKMVIQRKTTVETLAGAISFFLNI 
FFQISLVFLTGYLAIKGIVKIGTIEAIGALTGVIFTALGELGGQLSSIIGTKPIFLKLYSINPIESNKMNDIEPNEVNRDFPLYE 
AKNICYKYGDKEILKNLNFCFQRNEKYLILGESGSGKSTLLKLLNGFLRDYSGELRFCGDDIKKTSYLNMVSNVLYVDQ 
KAYLFEGTIRDNILLEENYTDEEILQSLEQVGLSVKDFPNNILDYYVGDDGRLLSGGQKQKITLARGLIRNKKIVLIDEGT 
SAIDRRTSLAIERKILDREDLTVIIVTHAPHPELKQYFTKIYQFPKDFIZ 

ID9 705 bp

ATAACAGTTAAACAGATTATGGACGAAATAGCCGTTTCAGATATGACTGCAAGGCGCTATTTACAGGAATTAGCT 

GATAAAGATTTGCTGATTCGTGTGCATGGTGGAGCTGAAAAACTTCGAACCAACTCCCTTTTGACT 

CAAATATTGAAAAACAAGCCCTCCAAACGGCAGAAAAACAAGAAATAGCCCATTTTGCAGGCAGTCTAGTAGAA 

GAAAGAGAAACTATTTTCATTGGACCAGGAACAACATTAGAGTnilUGCGCGTGAGTTGCCT 

GCGTCGTAACCAACAGTCTACCTGTTTTTCTGATTTTAAGCG^ 

AAATT ATCGCGATATTACAGGTGCTTTTGTTGGTACATTGACCCTACAAAATCTCrCTAATCT 
GCTTTCGTTAGCTGTAATGGTATTCAAAACGGAGCTCTAGCTACTTTTAGCGAGGAAG 
ATCGCTTTAAATAATTCTAATAAAAAATATTTACTCGCAGATCATAGCAAGTTCAATAAGTTTGA!^ 
TTATAATGTATCAAATCTTGATACTATTGTTTCAG ATTCTAA ACTAAGTG ATTCAATCC IT ' ITl'AAGCTATCTAAAC 
ACATTAAAGTCATCAAGCCTTAA 

ITVKQIMDEIAVSDMTARRYLQELADKDLLIRVHGGAEKLRTNSLLTNERSNIEKQALQTAEKQEIAHFAGSLVEERETI 

FIGPGTTLEFFARELPIDNIRVVTNSLPVFLILSERKLTDLILIGGNYRDITGAFVGTLTLQNLSNLOFSKAFVSCNGIQNGA 

LATFSEEEGEAQRIALNNSNKKYLLADHSKFNKFDFYTFYNVSNLDTIVSDSKLSDSILFKLSKHIKVIKPZ 

ID10 483 bp 

ATGACTGAGTTTTCGTTAGATCTTCTTCTAGAAGCCATTAAACTAGCTCGTTGGACCTACT 

AGCTAGACAA AACAGATAAAGACCAAG AGCTT AAAACTGAAATTCAATCCATCTTTATCGAACACAAGGG AAATT 

ATGCTTATCGCCGGGTTCATTTAGAACTAAGAAATCGTGGTTATCTGGTAAATCATAAAAGAGTTCAAGGCTTGaT 

GAAAGTACTCAATTTACAAGCTAAAATGCGAAAGAAACGAAAATATTCTTCTCATAAAGGAGACGTTGGTAAGAA 

GGCAGAGAATCTCATTCAAGCCCAATTTGAAGGCTCTAAAACAATGGAAAAGTGCTACACAGATGTGACTGAATT 

TGCCATTCCAGCAAGTACTCAAAAGCTTTACTTATCACCAGTTTTAGATGGCTTTAACAGCGAAATT 

AATCTTI C n GTTCGCCTAATTTAG AATAA 

MTEFSLDLLLEAIKLARWTYYYHLKQLDKTDKDQELKTEIQSIFIEHKGNYAYRRVHLELRNRGYLVNHKRVQGLMK
VLNLQAKMRKKRKYSSHKGDVGKKAENLIQAQFEGSKTMEKCYTDVTEFAIPASTQKLYLSPVLDGFNSEIIAFNLSCS 
PNLEZ 

ID14 1266 bp 

CCAGGATTTGGTACCGTTGCAAGTGGTGTGCCTTTCCTCCTAAAGGAAAATGGAGGAAAAATCAATCAATCAGCA 

CA TTCA GA TATC AAAGTTGCTAAGGTATTGGTCAAGGATGAAGATGAAAAAAATCGCTTGCTTGCAGCAGGGAAT 

GACTTTAACTTTGTAACCAATGTGGATGAT ATTTT ATCAGACCAGGATATTACTATCGTAGTGGAATTGATGGGGC 

GTATTGAGCCTGCTAAAACCTTTATCACrCGTGCCTTGGAAGCTGGAAAACACGTTGTTACTGCTAACAAGGACCT 

TTTAGCrGTCCATGGCGCAGAATTGCTAGAAATCGCTCAAGCTAACAAGGTAGCACTTTACTACGAAGCAGCAGT 

TGCTGGTGGGATTCCAATTCITCGTACTTTAGCAAATTCCTTGGCTTCTC 

GTCAACGGAACTTCCAACTTCATGGTGACCAAGATGGTGGAAGAAGGCTGGTCTTACGATGATGCTCTTGCGGAA 

G CACA ACGTCTAGGATTTGCAGAAAGCGATCCGACGAATGACGTAGATGGGATTGATGCAGCCTACAAGATGGTT 

ATTTTGAGCCAATTTGCCTTTGGCATGAAGATTGCCTTTGATGATGTAGCCCACAAGGGAATCCGCAATATCACAC 

CAGAAGACGTAGCTGTAGCTCAAGAGCTTGGTTACGTAGTGAAATTGGTTGGTTCTATTGAGGAAACTTCTTCAGG 

TA TTGC TGCAGAAGTGACTCCAACCTTCCTACCTAAAGCGCACCCACTTGCTAGTGTGAATGGCGTAATGAACGCT 

GTCTTTGTAGAATCTATCGGTATTGGTGAGTCTATGTACTACGGACCAGGTGCGGGTCAAAAACCAACTGCAACA
AGTGTTGTAGCTGATATTCTCCGTATCGTTCGTCGTTTGAATGATGGTACTATTGGCAAAGACTTCAACGAATATA
GCCGTGACrTGGTCTTGGCAAATCCTGAAGATGTCAAAGCAAACTA 

AGGTCAGGTCTTGAAGTTGGCTGAAATCTTCAATGCrCAAGATATTTCCTTTAAGCAAATC 

GAGGGTGACAAGGCGCGTGTCGTTATCATCACACACAAGATTAATAAAGCCCAGCTTGAAAATGTCTCAGCTGAA 
TTGAAGAAGGTTTCAGAATTCGACCTCTTGAATACCTTCAAGGTGCTAGGAGAATAA 

PGFGTVASGVPFLLKENGGKINQSAHSDIKVAKVLVKDEDEKNRLLAAGNDFN^

AKTFITRALEAGKHVVTANKDLLAVHGAELLEIAQANKVALYYEAAVAGGIPILRTLANSLASDKITRVLGVVNGTSNF
MVTKMVEEGWSYDDALAEAQRLGFAESDPTNDVDGIDAAYKMVILSQFAFGMKIAFDDVAHKGIRNITPEDVAVAQE
LGYVVKLVGSIEETSSGIAAEVTPTFLPKAHPLASVNGVMNAVFVESIGIGESMYYGPGAGQKPTATSVVADIVRIVRRL
NDGTIGKDFNEYSRDLVLANPEDVKANYYFSILALDSKGQVLKLAEIFNAQDISFKQILQDGKEGDKARVVIITHKINKA 
QLENVSAELKKVSEFDLLNTFKVLGEZ 

ID16 1725 bp 

ATGAAACACCTATTATCTTACTTCAAACCCTACATCAAGGAATCAATTTTAGCCCCCTTGTTCAAGCT 

CTGTTTTTGAGCTCTTGGTTCCCATGGTGATTGCTGGGATTGTTGACCAATCTTTACCTCAGGG 

TCTCTGGATGCAGATTGGCCTGCTCCTTATCTTTGCAGTAATT 

CAGCAAAGGCAGCAGTAGGTTCTGCTAAGGAATTGACAAACGATCTTTATCGTCATATTCTTTCCTTGCCCAAGGA 
CAGCAGAGACCGTCTGACAACnTCrAGTTTGGTCACTCGCTTGACTTCGGATACCTACCAGATTCAGACT^ 
AATCA ATTCCTGCGTCTC 1 I I 1 I ACGAGCGCCCATTATCG'l'ri'rrGGTGCCArrri'rATGGCTTATCGAATCTCAGC 
TGAGTTGACTTTCTGGTTCTTAGTCTTGGTTGCCATTTTGACCATTGTCATTGTAGGGT^ 

CTTTCTACAGTAGTCTCAGAAAGAAAACGGACCAACTGGTTCAGGAAACGCGCCAGCAATTGCAAGGGATGCGGG 
TTATTCGTGCTTTTGGTCAAGAAAAACGAGAGTTACAGATTTTTCAAACCCTTAACC 

AGAAAAGACAGGTTTCTGGTCTAGTTTATTAACACCTCTGACCTATCTGATTGTCAATGGAACTCTTCTCGTTA 

ATCTGGCAAGGCTATATTTCAATTCAAGGAGGAGTGCTCAGTCAAGGTGCTCTCATTGCTCTTATCAATT 

TACAGATTTTGGTGGAATTGGTCAAGCTAGCCATGTTGATCAATTCCCrCAACCAGTCCTATATCTCAGTCAAGCG 

AATCGAGGAAGTCTTTGTTGAGGCTCCAGAGGATATCCATTCAGAGTTAGAACAAAAGCAAGCTACCAGAGATAA 

GGTITTACAAGTCCAAGAATTGACCTTTACCTATCCTGATGCGGCCCAG 

ATGACTCAAGGACAAATTCTAGGTATCATCGGGGGAACTGGTTCTGGTAAATCAAGCTTGGTGCAACTCTTACTTG 

GACTTTATCCAGTAGACAAGGGGAACATTGACCTTTATCAAAATGGACGTAGTCCTCTTAATTTGGAGCAGTGGC 

GGTCTTGGATTGCCTATGTACCTCAAAAGGTCGAACTCTTTAAAGGAACCATTCGTTCCAACT^ 

CAATCAAGAAGTATCTGACCAGGAACTCTGGCAGGCCTTGGAGATTGCGCAAGCTAAGGATTTTGTCAGTGAAAA 

GGAAGGACTCTTGGATGCTCTAGTTGAGGCAGGGGGGCGAAATTTCTCAGGTGGACAAAAACAAAGATTGTCTAT 

CGCCCGAGCAGTCTTGCGCCAGGCTCCGTTTCTCATCCTAGATGATGCAACCTCGGCACTGGATACCATTACAGAG 

TCCAAGCrCTTGAAAGCTATTAGAGAAAATTTTCCAAACACGAGCTTAATTTTGATCTCTC^ 

TACAGATGGCGGACCAGATTCTCCTCrrGGAAAAAGGTGAGTTGCTAGCTGTTGGCAAGCACGATGACTTGATGA 

AATCCAGCCAAGTCTATTGTGAAATCAATGCATCCCAACATGGAAAGGAGGACTAG 

MKHLI^YFKPYIKESILAPLFKLLEAVFELLVPMVIAGIVDQSLPOGDQGHLWMQIGLLLIFAVIGVLVALIAQFYSAKA 

AVGSAKELTNDLYRHILSLPKDSRDRLTTSSLVTRLTSDTYQIQTGINQFLRLFLRAPIIVFGAIFMAYRISAELTFWFLVL 

VAILTIVTVGLSRLVNPFYSSLRKKTDOLVQETRQQLQGMRVIRAFGQEKRELQIFQTLNQVYARLQEKTGFWSSLLTPL 

TYLIVNGTLLVIIWQGYISIOGGVLSQGALIALINYLLOILVELVKLAMLINSLNQSYISVKRIEEVFVEAPEDIHSELEQKQ 

ATRDKVLQVOELTFTYPDAAQPSLRYISFDMTQGQILGIIGGTGSGKSSLVQLLLGLYPVDKGNIDLYQNGRSPLNLEQ 

WRSWIAYVPQKVELFKGTIRSNLTLGFNQEVSEXJELWQALEIAQAKDFVSEKEGLLDALVEAGGRNFSGGQKQRLSIA 

RAVLRQAPFLILDDATSALDTITESKLLKA1RENFPNTSLILIS0RTSTLQMADQILLLEKGELLAVGKHDDLMKSSQVYC 

EINASQHGKEDZ 

ID18 1224 bp

ATG AA ACGTTCTCTCG ACTC A AG AGTCG ATT AC AGTTTGCTCTTG CC AGT A 1 ' ITT1 ' 1 CT ACTGGTC ATCGGTGTGGT 

GGCTATCTATATAGCCGTTA GTCA TGATTATCCCAATAATATTCTGCCCATTTTAGGGCAGCAGGTCGCCTGGATT 

GCCTTGGGGCTTGTGATTGGTTTTGTGGTCATGCTCTTTAATACAGAATTTCTTTGGAAGGTGAC 

TATITTAGGCTTGGGACTTATGATCTTGCCGATTGTATTTTATAATCCAAGCTTAGTTGCATCAACGGGTGCCAAA 

AACTGGGTATCAATAAATGGAATTACCCTATTCCAACCGTCAGAATTTATGAAGATATCCTATATCCTCATGTTGG 

CTCGTGTCATTGTCCAATTTACAAAGAAACATAAGGA^TGGAGACGCACGGTTCCGCTGGACTTTTTGT^ 

CTGGATGATTCTCTTTACCATTCCAGTCCTAGTTCTTTTAGCACTTCAAAGTGACTTGGGGA 

TAGCCATTTTCTCAGGAATCGTTTTATTATCAGGGGTTTCTTGGAAAATTATTATCCCAGTATTTGTG 

ACAGGAGTTGCTGGTTTC TTAG CTATCTTTATTAGCAAGGACGGACGAGCl'1'ri'C 1TCACCAGATTGGAATGCCGA 

CCTACCAAATTAATCGGATTTTGGCTTGGCTCAATCCCrTTGAGTTTGCCCAAACA^ 

AGGGCAGATTGCCATTGGGAGTGGTGGCTTATTTGGTCAGGGATTTAATGCTTCGAATCTGCTTATCCCAGTTCGA 

GAGTCAGATATGATTTTTACGGTTATTGCAGAAGATTTTGGCTTTATTGGCTCTGTCCTGGTTATT 

CATGTTGATTTACCGTATGTTGAAGATTACTCTTAAATCAAATAACCAGTTCTACACTTATATTTCCACAGGTTTGA 

T TATG ATGTTGCTCTTCCACATCTTTGAGAATATCGGTGCTGTGACTGGACTACTTCCTITGACGG 

CCTTTCATTTCGCAAGGGGGATCAGCTATTATCAGTAATCTGATTGGTGTTGGTTTGCTTTTATCGATGAGTTACCA
GACTAATCTAGCTGAAGAAAAGAGCGGAAAAGTCCCATTCAAACGGAAAAAGGTTGTATTAAAACAAATTAAATA
A 

MKRSLDSRVDYSLLLPVFFLLVIGVVAIYIAVSHDYPNNILPILGQQVAWIAJ-GLVIGFVVMLFNTEFLWKVTPFLYILGL 
GLMILPIVFYNPSLVASTGAKNWVSINGITLFQPSEFMKISYILMLARVIVQFTKKHKEWRRTVPLDFLLIFWMILFTIPVL
VLLALOSDLGTALVFVAIFSGIVLLSGVSWKIIIPVFVTAVTGVAGFLAIFISKDGRAFLHOIGMPTYQINRILAWLNPFEF 
AQTTTYQOAQGQIAIGSGGLFGOGFNASNLLIPVRESDMIFTVIAEDFGFIGSVLVIALYLMLIYRMLKITLKSNNOFYTY 
ISTGLIMMLLFHIFENIGAVTGLLPLTGIPLPFISQGGSA1ISNLIGVGLLLSMSYQTNLAEEKSGKVPFKRKKVVLKQ1KZ 

ID22 987 bp

ATGGTGGCTAAGAAAAAAATCTTATTTTTTATGTGGTCTn 

CCATTGTTTCAAATCTGGATCCAGAAAAGTATGATATTGATATTCTTGAAATGGAGCACTTTGACAAGGGATATGA 
ATCTGTTCCAAAGCATGTACGCATTTTAAAATCCCTTCAAGATTATCGCCAAACCAGATGGTTACGAGCT' ri ^'l^' G 
TGGAGAATGAGAATTTATTTTCCAAGACTGACTCGTCGTTTGCTTGTAAAAGATGATTATGATGTTG 
TTACCATTATGAATCCACCACTGTTGTTCTCTAAAAGAAGAGAAGTCAAGAAGATATCTTGGATTCATGGAAGTAT 
TGAAGAACTTCTTAAGGATAGCTCTAAAAGAGAATCACATAGAAGCCAGTTGGATGCTGCGAATACAATTGTAGG 
GATTTCAAAAAAGACCAGCAATTCTATCAAGGAAGTTTATCCAGATTATACTTCTAAATTACAGACAATCTACAAT 
GGATATGATTTTCAGACTATTCTAGAAAAATCTCAAGAGAAGATCGATATCGAGATTGCTCCTCAAAGTATCTGTA 
CTATCGGACGGATTGAGGAAAATAAGGGTTCTGACCGTGTAGTGGAAGTGATACGATTATTACACCAAGAGGGAA 
AAAACTATCATCTCTATTTTATCGGGGCTGGTGATAT^ 

TTGAGGACTATGTACATTTCCTTGGTTATCAAAAAAATCCTTATCAGTATCTATCTCAGACGAAAGTTCTTTTGTCT 
ATGTCTAAACAAGAAGGTTTTCCTGGAGTGTATGTGGAGGCCTTGAGTCTGGGACTCCCTTTTATCTCTACGGACG 
TTGGAGGGGCTGAGGAATTATCCCAAGAAGGACGATTTGGACAAATCATTGAGAGCAATCAAGAGGCAGCTCAG 
GCGATTACTAATTACATGACTTCTGCCTCAAACTTTGATGTCGATGAGGCTAGCCAATTCATTCAACAATTTACAA 
TTACAAAACAAATCGAACAAGTAGAAAAACTATTAGAGGAGTAG 

MVAKKKILFFMWSFSLGGGAEKII^TrVSNLDPEKYDIDILEMEHFDKGYESVPKHVRILKSLQDYRQTRWLRAFLWRM 
RIYFPRLTRRLLVKDDYDVEVSFTIMNPPLLFSKRREVKKISWIHGSIEELLKDSSKRESHRSQLDAANTrVGISKKTSNSIK 
EVYPDYTSKLQTIYNGYDFQTILEKSQEKIDIEIAPQSICTIGRIEENKGSDRVVEVIRLLHQEGKNYHLYFIGAGDMEEEL
KKRVKEYGIEDYVHFLGYQKNPYQYLSQTKVLLSMSKQEGFPGVYVEALSLGLPFISTDVGGAEELSQEGRFGQIIESNQ 
EAAQAITNYMTSASNFDVDEASQFIQQFTITKQIEQVEKLLEEZ 

ID23 1434 bp 

ATGGAAACTGCATTAATTAGTGTGATTGTGCCAGTCTATAATGTGGCGCAGTACCTAGAAAAATCGATAGCTTCCA 
TTCAGAAGCAGACCTATCAAAATCTGGAAATTATTCTTGTTGATGATGGTGCAACAGATGAAAGTGGTCGCTTGTG 
TGATTCAATCGCTGAACAAGATGACAGGGTGTCAGTGCTTCATAAAAAGAACGAAGGATTGTCGCAAGCACGAAA 
TGATGGGATGAAGCAGGCTCACGGGGATTATCTGATTTTTATTGACTCAGATGATTATATCCATCCAGAAATGATT 
CAGAGCTTATATGAGCAATTAGTTCAAGAAGATGCGGATGTTTCGAGCTGTGGTGTCATGAATGTCTATGCTAATG 
ATGAAAGCCCACAGTCAGCCAATCAGGATGACTATTTTGTCTGTGATTCTCAAACATTTCTAAAGGAATACCTCAT 
AGGTGAAAAAATACCrGGGACGATTTGCAATAAGCTAATCAAGAGACAGATTGCAACTGCCCTATCCTTTCCTAA 
GGGGTTGATTTACGAAGATGCCTATTACCATTTTGATTTAATCAAGTTGGCCAAGAAGTATGTGGTTAATACTAAA 
CCCTATTATTACTA TTTCC ATAGAGGGGATAGTATTACGACCAAACCCTATGCAGAGAAGGATTTAGCCTATATTG 
ATATCTACCAA AAGTT TTATAATGAAGTTGTGAAAAACTATCCTGACTTGAAAGAGGTCG C1 11 II 1 CAGATTGGC 
CTATG CCCACr i'CMUM'ATTCTGGATAAGATGTTGCTAGATGATCAGTATAAACAGTTTGAAGCCTATTCTCAGATT 
CATCGTTTTTTAAAAGGCCATGCCTTTGCTATITCTAGGAATCCAATTTTCCGTAAGGGG 

TGGCCCTATTCATAAATATTTCCTTATATCGATTCITATTACTGAAAAATATTGAAAAATCTAAAAAATTACATTA 
G 



METALISVIVPVYNVAQYLEKSIASIQKQTYQNLEIILVDDGATDESGRLCDSIAEQDDRVSVLHKKNEGLSQARNDGM
KOAHGDYLIFIDSDDYIHPEMIQSLYEQLVQEDADVSSCGVMNVYANDESPQSANODDYFVCDSOTFLKEYLIGEKIPG 
TICNKLIKRQIATALSFPKGLIYEDAYYHFDLIKLAKKYVVNTKPYYYYFHRGDSITTKPYAEKDLAYIDIYQKFYNEVV 

KNYPDLKEVAFFRLAYAHFFILDKMLLDDQYKQFEAYSQIHRFLKGHAFAISRNPIFRKGRRISALALF1NISLYRFLLLK 
NIEKSKKLHZ 



ID24 735bp 

ATGAGAATCAAAGAGAAAACCAATAATATTAATGGAGGAATAAAAAATGTAAGTAAGCATTATGGTCATTCAATC
ATTCTCAAAGATATAAATTTTGCACTTAACAAGGGTGAAATTGTTGGTCTAGCAGGGAGAAATGGAGTTGGTAAG 
AGTACGTTGATGAAAATTCTTGTTCAGAATAATCAACCGACTTCAGGTAATATTATAAGCAGTGATAATGTTGGGT 
ATTTAATCGAAGAACCAAAATTATTTTTATCTAAAACAGGTTTAGAGAATTTAAAATATTTGTCAAATTTATATGG 
TGTTGACTAC AATC AAGAAAGATTTAGATGTTTGATCCAAGAGTTAGATTTGACTCAGTCTATTAATAAAAAAGTA 

AAGACCTATTCTTTGGGTACAAAACAAAAATTAGCTTTGCTTCTA
TAGATGAACCGACTAATGGTTTAGATATTGAATCATCACAAATAGTTTTAGCGGTTCTAAAAAAATTAGCTTT
TGAAAATGTGGGAATTTTAATATCGAGTCATAAATTAGAAGACATTGAAGAAATTTGTGAGAGAGTTCTTTTCTTG 
GAGAACGGGCTTTTGACATTTCAAAAAGTAGGAAAAGATAGTCATAATTTCT^ 
CTACAGATAGAGACATTTTCATTACCAAACAAGAATTTTGGGATATTGTTTAG 


MRIKEKTNNINGGIKNVSKHYGHSIILKDINFALNKGEIVGLAGRNGVGKSTLMKILVQNNQPTSGNIISSDNVGYLIEEP

KLFLSKTGLENLKYLSNLYGVDYNQERFRCLIQELDLTQSINKKVKTYSLGTKQKLALLLTLVTEPDILILDEPTNGLDIE 

SSQIVLAVLKKLALHENVGILISSHKLEDIEEICERVLFLENGLLTFQKVGKDSHNFLFEIAFSSATDRDIFITKQEFWDIVZ 

ID25 1704bp

ATGACTGAATTAGATAAACGTCACCGCAGTAGCATTTATGACAGCATGGTTAAATCACCTAACCGTGCTATGCTTC 
GTGCGACTGGTATGACAGATAAGGACTTTGAAACATCGATTGTGGGAGTGATTTCGACTTGGGCGGAAAATACAC 
CATGTAACATTCACTTGCATGATTTCGGGAAACTGGCTAAAGAAGGTGTCAAATCTGCAGGCGCTTGGCCTGTAC 
AGTTTGGAACCATTACCGTAGCGGACGGGATCGCTATGGGAACGCCTGGTATGCGTTTCTCTCTAACATCTCGTGA
CATCATCGCGGACTCCATCGAGGCGGCTATGAGTGGTCACAACGTGGATGCCTTCGTCGCTATCGGTGGCTGTGA 
GAAGAACATGCCTGGATCTATGATTGCTATTGCTAATATGGATATC 

GCACCGGGAAATCTTGATGGTAAAGATATCGACTTGGTTTCTGTCTTTGAAGGTATCGGAAAATGGAACCACGGT 
GACATGACAGCTGAGGACGTGAAACGTCTTGAATGTAATGCCTGCCCTGGCCCTGGTGGTTGTGGTGGTATGTAT 

ACTGCTAATACCATGGCAACTGCTATCGAAGTTCTAGGGATGAGTTTGCCAGGGTCATCCTCTCACCCAGCTGAAT
CAGCTGATAAGAAAGAAGATATCGAAGCAGCAGGACGTGCTGTTGTTAAGATGTTGGAACTTGGTCTCAAACCAT 
CAGATATCrTGACTCGTGAAGCCTTTGAAGATGCTATCACTGTAACGATGGCTCTCGGTGGTTCTACAAACGCCAC 
TCTTCACTTGCTCGCCATTGCCCATGCCGCAAATGTTGACTTGTCACTTGAGGACTTCAATACGATTCAAGAACGT 
GTGCCTCACTTGGCCGACTTGAAACCATCTGGTCAGTATGTCTTCCAAGACCTCTACGAAGTCGGTGGTGTCCCTG 

CGGTTATGAAGTATTTGTTGGCAAATGGTTTCCTTCACGGAGATCGCATCACATGTACTGGTAAGACTGTAGCTGA
AAA(nTGGCTGACTTTGCAGACTTGACTCCAGGCCAAAAAGTTATCATGCCACTTG 

TGGTCCGCTTATCATCTTGAACGGGAACCITGCrrCCTGACGGTGCAGTTGCCAAGGTATCAGGTGTTAAAGTGCGT 
CGTCACGTTGGGCCAGCTAAGGTCTTTGACTCAGAAGAAGATGCGATTCAGGCCGTTCTGACAGATGAAATCGTT 
GATGGCGATGTAGTCGTTGTTCGTTTTGTTGGACCTAAAGGTG 
AATGATTGTTGGTAAAGGTCAGGGAGATAAGGTGGCCCTCTTGACGGACGG

CTGGTTGTTGGACATATCGCTCCTGAAGCTCAGGATGGTGGACCAATTGCCTATCTCCGTACCGGCGATATCGTTA 
CGGTTGACCAAGATACCAAAGAAATTTCTATGGCCGTATCCGAAGAAGAACTTGAAAAACGCAAGGCAGAAACA 
ACCTTGCCACCACTTTACAGCCGTGGTGTCCTCGGTAAATATGCCCACATCGTATCATCTGCTTCACGCGGAGCCG 
TGACAGACTTCTGGAATATGGACAAGTCAGGTAAAAAATAA 


MTELDKRHRSSIYDSMVKSPNRAMLRATGMTDKDFETSIVGVISTWAENTPCNIHLHDFGKLAKEGVKSAGAWPVQFG 
TITVADGIAMGTPGMRFSLTSRDIIADSIEAAMSGHNVDAFVAIGGCDKNMPGSMIAIANMDIPAIFAYGGTIAPGNLDG 
KDIDLVSVFEGIGKWNHGDMTAEDVKRLECNACPGPGGCGGMYTANTMATAIEVLGMSLPGSSSHPAESADKKEDIE 
AAGRAVVKMLELGLKPSDILTREAFEDAITVTMALGGSTNATLHLLAIAHAANVDLSLEDFNTIOERVPHLADLKPSGQ 
YVFQDLYEVGGVPAVMKYLLANGFLHGDRITCTGKTVAENLADFADLTPGQKVIMPLENPKRADGPLHLNGNLAPDG
AVAKVSGVKVRRHVGPAKVFDSEEDAIQAVLTDEIVDGDVVVVRFVGPKGGPGMPEMLSLSSMIVGKGQGDKVALLT 
DGRFSGGTYGLVVGHIAPEAQDGGP1AYLRTGDIVTVDQDTKEISMAVSEEELEKRKAETTLPPLYSRGVLGKYAHIVSS 
ASRGAVTDFWNMDKSGKKZ

ID26 274bp

ATGTTATAATAAAAATAAAGAATTTAAGGAGAAATACAATATGTCAATTTTTATTGGAGGAGCATGGCCATATGC 
AAACGGTTCGTTACATATTGGTCACGCGGCAGCGCTTTTACCGGGGGATATTCTTGCAAGATACTATCGTCAGAA 
GGGAGAGGAAGTTTTATATGTTTCTGGAAGTGATTGTAATGGAACCCCTATTTCTATCAGAGCTAAAAAAGAAAA 
TAAGTCTGTGAAAGAAATTGCTGATTTTTATCATAAGGAATTTAATCCA

CYNKNKEFKEKYNMSIFIGGAWPYANGSLHIGHAAALLPGDILARYYRQKGEEVLYVSGSDCNGTPISIRAKKENKSVK 
EIADFYHKEFNP 

ID28 1065bp

ATGACAACATTATTTTCAAAAATTAAAGAAGTAACAGAACTTGCTGCAGTCTCAGGTCATGAAGCGCCTGTCCGT 

GCTTATCTTCGTGAAAAGTTGACACCGCATGTGGATGAAGTGGTGACAGATGGCrrGGGTGGTATTTTTGGTATCA 

AACATTCAGAAGCTGTGGATGCACCGCGCGTCTTGGTCGCnTCTCATATGGACGAAGTTGGTTTTATGGTCAGCGA 

AATCAAGCCAGATGGTACCTTCCGTGTCGTAGAAATCGGTGGCTGGAACCCCATGGTGGTTAGCAGCCAACGTTT
CAAACTCTTGACTCGTGATGGTCATGAAATTCCTGTGATTTCAGGTTCTGTTCCTCCGCATTTGACT 
GGGGGACCAACCATGCCAGCCATTGCCGATATCGTTTTTGATGGTGGTTTTGCGGACAAGGCTGAGGCAGAAAGT 
TTTGGCATCCGTCCTGGTGATACCATTGTACCAGATAGTTCTGCAATTTTGACAGCCAATGAAAAAAATATCATCT 
CAAAAGCTTGGGATAACCGCTACGGTGTCCTCATGGTAAGCGAGCTAGCTGAAGCTTTATCGGGTCAAAAACTCG 

GCAATGAACTCTATCTGGGTTCTAACGTCCAAGAAGAAGTTGGTCTGCGTGGCGCTCATACCTCTACAACCAAGTT
TGACCCAGAAGTCTTCCTCGCAGTTGATTGCTCACCAGCAGGTGATGTCTACGGTGGTCAAGGCAAGATTGGAGA
TGGAACCTTGATTCGTTTCTATGATCCAGGTCACTTC 

GAAGAAGCTGGTATCAAGTACCAATACTACTGTGGTAAAGGCGGAACAGATGCAGGTGCAGCTCATCTGAAAAAT 
GGTGGTGTCCCATCAACAACTATCGGTGTCTGCGCTCGTTATATCCATTCTCACCAAACCCTCTATGCAATGGATG 
ACTTCCTAGAAGCGCAAGCTTTCTTACAAGCCTTGGTGAAGAAATTGGATCGTTCAACGGTTGATTTGATTA
TTATTAA 

MTTLFSKIKEVTELAAVSGHEAPVRAYLREKLTPHVDEVVTDGLGGIFGIKHSEAVDAPRVLVASHMDEVGFMVSEIKP 
DGTFRVVEIGGWNPMVVSSQRFKLLTRDGHEIPVISGSVPPHLTRGKGGPTMPAIADIVFDGGFADKAEAESFGIRPGDT 
IVPDSSAILTANEKNIISKAWDNRYGVLMVSELAEALSGQKLGNELYLGSNVQEEVGLRGAHTSTTKFDPEVFLAVDCS
PAGDVYGGQGKIGDGTLIRFYDPGHLLLPGMKDFLLTTAEEAGIKYQYYCGKGGTDAGAAHLKNGGVPSTTIGVCARY 
IHSHQTLYAMDDFLEAQAFLQALVKKLDRSTVDLIKHYZ

ID31 1182bp 

ATGG A ATTTTCTATG AA ATC AGTCA A AGG ACTACTCTTT ATC ATA Vl ' l ' l G ACTTGG AT 

GAACACTTCTCCCCAATTCATGATTCCAGGACTAGCTTTAACAAGCCTATCTCTGACTTTTATCCTAG 
CTCCCACTACTAGAAAGCTGGTTTCACAGTTTGGAGAAGGTCTACACCGTCCACAAATTCACAGCCTTTCTCTCAA 
TCATCCTACTAATCTTTCATAACITTAGTATGGGCGGTTTGTGGGGCTCrCGCTT 
GCCATCTATATCTTTGCCAGCATCATCCnTGTCGCCTATTTAGGCAAATACATCCAATACGAAGCTTGGCGATGGA
TTCA CCGCCTGG TTTACCTAGCCTATATTTTAGGACT.CT^CACATCTA^ 

TTTAATCTTCTAAG'1'n'l Cl'l GTTGGTAGCTATGCCCITTTAGGCTTACTAGCTGGTTT UJ ATATCATTTTTCTATAT 

CAAAAGATTTCCTTCCCCTATCTAGGGAAAATTACCCATCTCAAACGCTTAAATCACGATACTAGAGAAATTCAA 

ATCCATCTTAGCAGACCTTTCAACTATCAATCAGGACAATTTGCCITTCTAAAGATTT^ 

GTGCTCCGCATCCCTTTTCTATCTCAGGAGGTCATGGTCAuAACTCTTTACTTTACTGTTAAAACT

TACCAAGAATATCTATGATAATCTTCAAGCCGGCAGCAAAGTAACCCTAGACAGAGCTTACGGACACATGATCAT 
AGAAGAA GGACG AGAAAATCAGGTTTGGATTGCTGGAGGTATTGGGATCACCCCCTTCATCTCTTACATCCGTGA 
ACATCCTATTTTAGATAAACAGGTTCACTTCTACTATAGCTTCCGTGGAGATGAAAATGCAGTCTACCT 
CTCCGTAACTATGCTCAGAAAAATCCTAATTTTGAACTCCATCTAATCGACAGTACGAAAGACGGCTATCTT^ 

TTGAACAAAAAGAAGTGCCCGAACATGCAACCGTCTATATGTGTGGTCCTATTTCTATGATGAAGGCACTTGCCA
AACAGATTAAGAAACAAAATCCAAAAACAGAGCATATTTAC 

MEFSMKSVKGLLFIIASFILTLLTWMNTSPQFMIPGLALTSLSLTnLATRLPLLESWFHSLEKVYTVHKFTAFLSIILLIFH 
NFSMGGLWGSRLAAQFGNLAIYIFASIILVAYLGKYIQYEAWRWIHRLVYLAYILGLFHIYM1MGNRLLTFNLLSFLVGS 
YALLGLLAGFYIIFLYQKISFPYLGKITHLKRLNHDTREIQIHLSRPFNYQSGQFAFLKIFQEGFESAPHPFSISGGHGQTLY
FTVKTSGDHTKNIYDNLQAGSKVTLDRAYGHMIIEEGRENQVWIAGGIGrrPFISYIREHPILDKQVHFYYSFRGDENAV 
YLDLLRNYAQKNPNFELHLIDSTKDGYLNFEQKEVPEHATVYMCGPISMMKALAKQIKKQNPKTEHIY 

ID32 900bp


ATGACTTTTAAATCAGGCTTTGTAGCCATTTTAGGACGTCCCAATGTTGGGAAGTC^ 

TGGGGCAAAAGATTGC CATC ATGAGTGACAAGGCGCAGACAACGCGCAATAAAATCATGGGAATTTACACGACTG 

ATAAGGAGCAAATTGTCTTTATCGACACACCAGGGATTCACAAGCCTAAAACAGCTCTCGGAGATTTCATGGTTG 

AGTCTGCCTACAGTACCCTTCGCGAAGTGGACACTGTTCTTTTCATGGTGCCTGCTGATGAAGCGCGTGGTAAGGG 

GGACGATATGATTATCGAGCGTCTCAAGGCTGCCAAGGTTCCTGTGATTTTGGTGGTGAATAAAATCGATAAGGTC
CATCCAGACCAGCTCTTGTCTCAGATTGATGACTTCCGTAATCAAATGGACTTTAAGGAAATTGTTCCAATCTCAG 
CCCrrCAGGGAAATAACGTGTCTCGTCTAGTGGATATTTTGAGTGAAAATCTGGATGAAGGTTTCCAATATTTCCC 
GTCTGATCAAATCACAGACCATCCAGAACGTTTCTTGGTTTCAGAAATGGTTCGCGAGAAAGTCTTGCACCTAACT 
CGTGAAGAGATTCCGCATTCTGTAGCAGTAGTTGTTGACTCTATGAAACGAGACGAAGAGACAGACAAGGTTCAC 

ATCCGTGCAACCATCATGGTCGAGCGCGATAGCCAAAAAGGGATTATCATCGGTAAAGGTGGCGCTATGCTTAAG
AAAATCGGTAGCATGGCCCGTCGTGATATCGAACTCATGCTAGGAGACAAGGTCTTCCTAGAAACCTGGGTCAAG 
GTCAAGAAAAACTGGCGCGATAAAAAGCTAGATTTGGCTGACTTTGGCTATAATGAAAGAGAATACTAA 

MTFKSGFVAILGRPNVGKSTFLNHVMGQKIAIMSDKA0TTRNK1MGIYTTDKEQIVFIDTPGIHKPKTALGDFMVESAYS 
TLREVDTVLFMVPADEARGKGDDMIIERLKAAKVPVILVVNKIDKVHPDQLLSQIDDFRNQMDFKEIVPISALQGNNVS
RLVDILSENLDEGFQYFPSDQITDHPERFLVSEMVREKVLHLTREEIPHSVAVVVDSMKRDEETDKVHIRATIMVERDSO 
KG1IIGKGGAMLKKIGSMARRDIELMLGDKVFLETWVKVKKNWRDKKLDLADFGYNEREYZ 

ID33 855bp


CTGC n C l i GTTTn ACAGAAGG AGG ACTTATGCCTG AATTACCTGAGGTTGAAACCGTTTGTCGTGGCTTAGAAA 
AATTGATTATAGGAAAGAAGATTTCGAGTATAGAAATTCGCTACCCCAAGATGATTAAGACGGATTTGGAAGAGT 
TTCAAAGGGAATTGCCTAGTCAGATTATCGAGTCAATGGGACGTCGTGGAAAATATTTGCTTTTTTATCTGACAGA 
CAAGGT CTTGA TTTCCCATTTGCGGATGGAGGGCAAGTATTTTTACTATCCAGACCAAGGACCTGAACGCAAGCAT 
GCCCATGTTTTCTTTCATTTTGAAGATGGTGGCACGCTTGTTTATGAGGATGTTCGCAAGTTTGGAACCATGGAAC
TCTTGGTGCCTGACCTTTTAGACGTCTACTTTATTTCTA
TTTACAGGTCTTTCAATCTGCCCITGCCAAGTCCAAAAAG 

GCTGGACTTGGCAATATCTATGTGGATGAGGTTCTCTGGCGAGCTCAGGTTCATCCAGCTAGACCTTCCCAGAC^ 

TGACAGCAGAAGAAGCGACTGCCATTCATGACCAGACCATTGCTGTTTTGGG CCAG GCTGTTGAAAAAGGTGGCT 

CCACCATTCGGACTTATACCAATGCCITTGGGGAAGATGGAAGCATGCAGGACTTTCATCAGGTCTAT GATA AGA 

CTGGTCAAGAATGTGTACGCTGTGGTACCATCATTGAGAAAATTCAACTAGGCGGACGTGGAACCCACTTTTGTCC 

AAACTGTCAAAGGAGGGACTGA 

MLLVFTEGGLMPELPEVETVCRGLEKLIIGKKISSIEIRYPKMIKTDLEEFQRELPSQIIESMGRRGKYLLFYLTDKVLISHL 
RMEGKYFYYPDQGPERKHAHVFFHFEDGGTLVYEDVRKFGTMELLVPDLLDVYF1SKKLGPEPSEQDFDLQVFQSALA 
KSKKP1KSHLLDOTLVAGLGNIYVDEVLWRAQVHPARPSQTLTAEEATAIHDQTIAVLGQAVEKGGSTIRTYTNAFGED 
GSMQDFHQVYDKTGQECVRCGTIIEKIQLGGRGTHFCPNCQRRDZ

ID34 633bp 

TTGTCCAAACTGTCAAAGGAGGGACTGATGGGAAAAATCATCGGAATCACTGGGGGAATTGCCTCTGGTAAGTCA 

ACTGTGACAAATTTTCTAAGACAGCAAGGCTTTCAAGTAGTGGATGCCGACGCAGTCGTCCACCAACTACAGAAA 

CCTGGTGGTCGTCTGTTTGAGGCTCTAGTACAGCACTTTGGGCAAGAAATCATTCITGAAj^ACGGA 

GCCCTCTCCTAGCTAGTCTCATCIIM'ICAAATCCTGATGAACGAGAATGGTCTAAGCAAATTCAAGGGGAGATTAT 

CCGTGAGGAACTGGCTACTTTGAGAGAACAGTTGGCTCAGACAGAAGAGA TITI C 1 1 CATGGATATTCCCCTACTT 

TTTGAGCAGGACTACAGCGATTGGTTTGCTGAGACTTGGTTGGTCTATGTGGACCGAGATGCCCAAGTGGAACGC 

TTAATGAAAAGGGACCAGTTGTCCAAAGATGAAGCTGAGTCTCGTCTGGCAGCCCAGTGGCCTTTAGAAAAAAAG 

AAAGATTTGGCCAGCCAGGTTCTTGATAATAATGGCAATCAGAACCAGCTTCTTAATCAAGTGCATATCCl'l'CriG 

AGGGAGGTAGGCAAGATGACAGAGATTAA 

MSKLSKEGLMGKnGITGGIASGKSTVTNFLRQQGFQVVDADAVVHQLQKPGGRLFEALVQHFGOEIILENGELNRPLL 

ASLIFSNPDEREWSKQIQGEI1REELATLREQLAQTEEIFFMDIPLLFEQDYSDWFAETWLVYVDRDAQVERLMKRDQLS 

KDEAESRLAAQWPLEKKKDLASQVLDNNGNQNQLLNQVHILLEGGRQDDRDZ 

ID35 1269bp

TTGATAATAATGGCAATCAGAACCAGCTTCTTAATCAAGTGCATATCCTTCTTGAGGGAGGTAGGCAAGATGACA 

GAGATTAACTGGAAGGATAATCTGCGCATTGCCTGGTTTGGTAATTTTCTGACAGGAGCCAGTAT11CI1MGGTTG 

TACCIU'IU ATGCCCATCTTCGTGGAAAATCTAGGTGTAGGGAGTCAGCAAGTCGCrrri'l ATGCAGGCTTAGCAAT 

TTCTGTCTCTGCTATTTCCGCGGCGCTC lTl'l CTCCTATTTGGGGTATTCTTGCTGACAAATACGGCCGAAAACCCA 

TGATGATTCGGGCAGGTCTTGCTATGACTATCACTATGGGAGGCTTGGCCTTTGTCCCAAATATCTAT^ 

CTTTCITCGTTTACTAAACGGTGTATTTGCAGGTTTTGTTCCTAATC 

AAGGAGAAATCAGGCTCTGCCTTAGGTACTTTGTCTACAGGCGTAGT^^ 

GTGGCTTTATCGCAGAATTATTTGGCATTCGTACAGn I"I C1U ACrGGTTGGTAGTTTTCTAri'1'11 AGCTGCTATTT 

TGACTATTTGCTTTATCAAGGAAGATTTTCAACCAGTAGCCAAGG 

CTCGGTTAAATATCCCTATCTTTTGCTCAATCTCTTTTTAACCAGTT^ 

GCCCTATTTTGGCTCTTTATGTACGCGACTTAGGGCAGACAGAGAATC1 lCrn"n GTCTCTGGTTTGATTGTGTCC 
AGTATGGGCTTTTCCAGCATGATGAGTGCAGGAGTCATGGGCAAGCTAGGTGACAAGGTGGGCAATCATCGTCTC 
TTGGTTGTCGCCCAGTTTTATTCAGTCATCATCTATCTCCTCT 

CTATCGTTTCCTCTTTGGATTGGGAACCGGTGCCITGATTCCCGGGGTTAATGCCCTACTCAGCA 
AAAGCCGGCATTTCGAGGGTCirTGCCTTCAATCAGGTATTCTTTTATCTGGGAGGTGTTGTTGG 
GTTCTGCAGTAGCAGGTCAATTTGGCTACCATGCTGTCTTTTATGC^ 
TTTAACCTGATTCAATTTCGAACATTATTAAAAGTAAAGGAAATCTAG 

MIIMAIRTSFLIKCISFLREVGKMTEINWKDNLRIAWFGNFLTGASISLVVPFMPIFVENLGVGSQQVAFYAGLAISVSAIS 

AALFSPIWGILADKYGRKPMMIRAGLAMTITMGGLAFVPNIYWLIFLRLLNGVFAGFVPNATAL1ASQVPKEKSGSALG 

TLSTGVVAGTLTGPFIGGFIAELFGIRTVFLLVGSFLFLAAILTICFIKEDFQPVAKEKAIPTKELFTSVKYPYLLLNLFLTS 

FVIQFSAQSIGPILALYVRDLGQTENLLFVSGLIVSSMGFSSMMSAGVMGKLGDKVGNHRLLVVAQFYSVIIYLLCANAS 

SPLQLGLYRFLFGLGTGALIPGVNALLSKMTPKAG1SRVFAFNQVFFYLGGVVGPMAGSAVAGQFGYHAVFYATSLCV 

AFSCLFNLIQFRTLLKVKEIZ 

ID36 1311bp

ATGGCCCTACCAACTATTGCCATTGTAGGACGTCCCAATGTTGGGAAATCAACCCTATTTAATCGGATCGCTGGTG 

AGCGAATCTCCATTGTAGAAGATGTCGAAGGAGTGACACGTGACCGTATTTATGCAACGGGTGAGTGGCTCAATC 

GTTCTTTTAGCATGATTGATACAGGAGGAATTGATGATGTCGATGCTCCTTTCATGGAACAAATCAAGCACCAGGC 

AGAAATTGCCATGGAAGAAGCAGATGTTATCGTTTTTGTCGTGTCTGGTAAGGAAGGAATTACTGATGCAGACGA 

ATACGTAGCTCGTAAGCTTTATAAGACCCACAAACCAGTTATCCTCGCAGTCAACAAGGTGGACAACCCTGAGAT 

GAGAAATGATATATATGATTTCTATGCTCTCGGTTTGGGTGAACCATTGCCTATCTCATCTGTCCATGGAATCGGT 

ACAGGGGATGTGCTAGATGCGATCGTAGAAAATCTTCCAAATGAATATGAGGAAGAAAATCCAGATGTCATTAAG 

TTTAGCTTGATTGGTCGTCCTAACGTTGGAAAATCAAGCTTGATCAATGCTATCTTGGGAGAAGACCGTGTTATT
CTAGTCCTGTTGCTGGAACAACTCGTGATGCCATTGATACCCACTTTACAGATACAGATGGTCAAGAGTTTACCAT
GATTGATACGGCTGGTATGCGTAAGTCTGGTAAGGTTTATGAAAATACTGAGAAATACTCTGTTATGCGTGCCATG 
CGTGCTATTGACCGTTCAGATGTGGTCTTGATGGTCATCAATGCGGAAGAAGGCATTCGTGAGTACGACAAGCGT 
ATCGCAGGATTTGCCCATGAAGCTGGTAAAGGGATGATTATCGTGGTCAACAAGTGGGATACGCTTGAAAAAGAT 
AACCACACTATGAAAAACTGGGAAGAAGATATCCGTGAGCAGTTCCAATACCTGCCTTACGCACCGATTATCTTT
GTATCAGCTTTAACCAAGCAACGTCTCCACAAACTTCCTGAGATGATTAAGCAAATCAGCGAAAGTCAAAATACA 
CGTATTCCATCAG CTGTC TTGAACGATGTCATCATGGATGCCATTGCCATCAACCCAACACCGACAGACAAAGGA 
AAACGTCTCA AGATT TTCTATGCGACCCAAGTGGCAACCAAACCACCAACCTTTGTCATCTTTGTCAATGAAGAAG 
AACTCATGCACi 1 l TCI rACCTGCGTTTCTTGGAAAATCAAATCCGCAAGGCCTTTGTTTTTGAGGGAACACCGAT 
TCATCTCATCGCAAGAAAACGCAAATAA

MALFTIAIVGRPNVGKSTLFNRIAGERISIVEDVEGVTRDRIYATGEWLNRSFSMIDTGGIDDVDAPFMEQIKHOAEIAM 
EEADVrVFVVSGKEGITDADEYVARKLYKTHKPVILAVNKVDNPEMRNDIYDFYALGLGEPLPISSVHGIGTGDVLDAI 
VENLPNEYEEENPDVIKFSLIGRPNVGKSSUNAILGEDRVIASPVAGTTRDAIDTHFTDTDGQEFTMIDTAGMRKSGKV 
YENTEKYSVMRAMRAIDRSDVVLMVINAEEGIREYDKRIAGFAHEAGKGMIIVVNKWDTLEKDNHTMKNWEEDIREQ
FQYLPYAPIIFVSALTKORLHKLPEMIKQISESQNTRIPSAVLNDVIMDAIAINPTPTDKGKRLKIFYATOVATKPPTFVIFV 
NEEELMHFSYLRFLENQIRKAFVFEGTPIHLIARKRKZ 






ID37 714bp 



GACTTAAATGAGATTTTGACAGCAGCCCAGATGGCATCATCTTGGAAGAATTTCCAATCCTACTCTGTGATTGTGG 
TAC GAAG TCAAGAGAAGAAAGATGCCTTGTATGAATTGGTACCTCAAGAAGCCATTCGCCAGTCTGCTGTTTTCCT 
TCTCITTGT CGG AG ATTTG A ACCG AG C AG A AAAGGG AG CCCG A CTTCATACCGACACCTTCCAACCCCAAGGTGT 

GGAAGGTCTCTTGATTAGTTCGGTCGATGCAGCTCnTGCTGGACAAAACGCCnTGTTGGCAGCTGA

TATGGTGGT GTGAT TATCGGTTTGGTTCGATACAAGTCTGAAGAAGTGGCAGAGCTCTTTAACCTACCTGACTACA 
^CTA^CTGTCTTTGGGATGGCACTGGGTGTGCCAAATCAACATCATGATATGAAACCGAGACTGCCACTAGAGA 
ATGTTGTCFTTGAGGAAGAATACCAAGAACAGTCAACrGAGGCAATCCAAGCTTATGACCGTGTTCAGGCTGACT 
ATGCTGGGGCGCGTGCGACCACAAGCTGGAGTCAGCGCCTAGCAGAACAGTTTGGTCAAGCTGAACCAAGCTCAA 

CTAGAAAAAATCTTGAACAGAAGAAATTATTGTAG

MTETIKLMKAHTSVRRFKEQEIPQVDLNEILTAAQMASSWKNFQSYSVIVVRSQEKKDALYELVPQEAIROSAVFLLFV 

GDLNRAEKGARLHTDTFQPQGVEGLLISSVDAALAGQNALLAAESLGYGGVHGLVRYKSEEVAELFNLPDYTYSVFG

MALGVPNQHHDMKPRLPLENVVFEEEYQEQSTEAIQAYDRVQADYAGARATTSWSQRLAEQFGQAEPSSTRKNLEQK 
KLLZ

ID38 729bp

ATGACAGAAATTAGACTAGAGCACGTCAGTTATGCCTATGGTCAGGAGAGGATTTTAGAGGATATCAACCTACAG 
GTGACTTCAGGCGAAGTGGTTTCCATCCTAGGCCCAAGTGGTGTTGGAAAGACCACCCTCTTTAATCTAATCGCTG
GGATTTTAGAAGTTCAGTCAGGGAGAATTGTCCTTGATGGTGAAGAAAATCCCAAGGGGCGCGTGAGTTATATGT 
TGCAAAAGGATCrGCTCTTGGAGCACAAGACGGTGCTTGGAAATATCATTCTGCCCCTCTTGATrCAA 
ATAAGGCAGAAGCTATTTCCCGAGCGGATAAAATTCTTGCGACCTTCCAGCTGACAGCTGTAAGAGACAAGTATC 
CTCATGAACITAGCGGTGGGATGCGCCAGCGTGTAGCCT 
CTTAGATGAGGCCITTAGCGCCTTGGATGAGATGACAAAGATGGAACTCCACGCTTGGTATCTTGAGATTCACAA
GCAGTTGCAGCTAACAACCCTGATCATCACGCATAGTATTGAGGAGGCCCTCAATCTCAGCGACCGTATCTATATC 
TTGAAAAATCGCCCTGGGCAGATTGTTTCAGAAATTAAACTAGATTGGTCTGAAGATGAGGACAAGGAAGTCCAA 
AAGATTGCCTACAAACGTCAAATTTTGGCGGAATTAGGCTTAGATAAGTAG 



MTEIRLEHVSYAYGQERILEDINLQVTSGEVVSILGPSGVGKTTLFNL1AGILEVQSGRIVLDGEENPKGRVSYMLQKDLL 
LEHKTVLGNIILPLLIQKVDKAEAISRADKILATFQLTAVRDKYPHELSGGMRQRVALLRTYLFGHKLFLLDEAFSALDE 
MTKMELHAWYLEIHKQLQLTTLIITHSIEEALNLSDRIYILKNRPGQIVSEIKLDWSEDEDKEVQKIAYKRQILAELGLDK 



ID39 2433bp



ATGAACTATTCAAAAGCATTGAATGAATGTATCGAAAGTGCCTACATGGrTGCTGGACATTTTGGAGCTCGTTATC 
TAGAGTCGTGGCACTTGTTGATTGCCATGTCTAATCACAGTTATAGTGTAGCAGGGGCAACTTTAAATGATTATCC 
GTATGAGATGGACCGTTTAGAAGAGGTGGCTTTGGAACTGACTGAAACGGACTATAGCCAGGATGAAACCTTTAC 
GGAATTGCCGTTCTCCCGTCGTTTGCAGGTTCTTTTTGATGAAGCAGAGTATGTAGCGTCAGTGGTCCATGCTAAG 
GTACT AGGGA CAGAGCACGTCCTCTATGCGATTTTGCATGATAGCAATGCCTTGGCGACTCGTATCTTGGAGAGG 
GCTGGi I i I TCI 1ATGAAGACAAGAAAGATCAGGTCAAGATTGCTGCTCTTCGTCGAAATTTAGAAGAACGGGCA 
GGCTGGACTCGTGAAGATCTCAAGGCITTACGCCAACGCCATCGTACAGTAGCTGACAAGCAAAATTCTATGGCC 
AATATGATGGGCATGCCGCAGACTCCTAGTGGTGGTCTCGAGGATTATACGCATGATTTGACAGAGCAAGCGCGT 
TCTGGCAAGTTAGAACCAGTCATCGGTCGGGACAAGGAAATCTCACGTATGATTCAAATCTTGAGCCGGAAGACT
AAGAACAACCCTGTCTTGGTTGGGGATGCTGGTGTCGGGAAAACAGCTCTGGCGCTTGGTCTTGCCCAGCGTATTG
CTAGTGGTGACGTGCCTGCGGAAATGGCTAAGATGCGCGTGTTAGAACTTGATTTGATGAATGTCGTTGCAGGGA 
CACGCTTCCGTGGTGACTTTGAAGAACGCATGAATAATATCATCAAGGATATTGAAGAAGATGGCCAAGTCATCC 
TCITTATCGATGAACTCCACACCATCATGGGTTCTGGTAGCGGGATTGATTCGACTCTGGATGCGGC 
GAAACCAGCCTTGGCGCGTGGAACTTTGAGAACGGTTGGTGCCACTACTCAGGAAGAATATCAAAAACATATCGA
AAAAGATGCGGCACTTTCTCGTCGTTTCGCTAAAGTGACGATTGAAGAACCAAGTGTGGCAGATAGTATGACTAT 
TTTACAAGGTTTGAAGGCGACTTATGAGAAACATCACCGTGTACAAATCACAGATGAAGCGGTTGAAACAGCGGT 
TAAGATGGCrCATCGTTATTTAACCAGTCGTCACTTGCCAGACTCTGCTATCGATCTCTTGGATGAGGCGGCAGCA 
ACAGTGCAAAATAAGGCAAAGCATGTAAAAGCAGACGATTCAGATTTGAGTCCAGCTGACAAGGCCCTGATGGAT 

GGCAAGTGGAAACAGGCAGCCCAGCTAATCGCAAAAGAAGAGGAAGTACCTGTCTACAAAGACTTGGTGACAGA
GTCTGATATTTTGACCACCTTGAGTCGCTTGTCAGGAATCCCAGTTCAAAAACTGACTCAAACGGATGCTAAGAAG 
TATTTAAATCnTGAAGCAGAACTCCATAAACGGGTTATCGGTCAAGATCAAGCTGTITCAAGCATTAGCCGTGCCA 
TTCGCCGCAACCAGTCAGGGATTCGCAGTCATAAGCGTCCGATTGGTTCCTTTATGTTCCTAGGGCCTACAGGTGT 
CGGGAAAACTGAATTAGCCAAGGCTCTGGCAGAAGTTCTTTTTGACGACGAATCAGCCCTTATCC^ 

AGTGAGTATATGGAGAAATTTGCAGCTAGTCGTCTCAACGGAGCTCCTCCAGGCTATGTAGGATATGAAGAAGGT
GGGGAGTTGACAGAGAAGGTTCGCAATAAACCCTATTCCGTTCTCCTCTTTGATGAGGTAGAGAAGGCCCACCCA 
G AT ATCTTTAATGTTCTCTTGCAGGTTCTGGATGACGGTGTCnTGACAG AT AGCAAGGGACGCAAGGTCG A I ' l ' I 
CAAATACCATTATCATTATGACATCGAATCTAGGTGCGACTGCCCTTCGTGATGATAAGACTGTTGGTTTTGGGGC 
TAAGGATATTCGTTTTGACCAGGAAAATATGGAAAAACGCATGTTTGAAGAACTGAAAAAAGCTTATAGACCGGA 

ATTCATCAACCGTATTGATGAGAAGGTGGTCTTCCATAGCCTATCTAGTGATCATATGCAGGAAGTGGTGAAGATT
ATGGTCAAGCCnTTAGTGGCAAGTTTGACTGAAAAAGGCATTGACTTGAAuATTACAAGCTTCAGCTCT 
TAGCAAATCAAGGATATGACCCAGAGATGGGAGCTCGCCCACTTCGCAGAACCCTGCAAACAGAAGTGGAGGAC 
AAGTTGGCAGAACTTCTTCTCAAGGGAGATTTAGTGGCAGGCAGCACACTTAAGATTGGTGTCAAAGCAGGCCAG 
TT AAAATTTG ATATTG C AT AA 


MNYSKALNECIESAYMVAGHFGARYLESWHLL1AMSNHSYSVAGATLNDYPYEMDRLEEVALELTETDYSQDETFTE 
LPFSRRLQVLFDEAEYVASVVHAKVLGTEHVLYAILHDSNALATRILERAGFSYEDKIO)Q\nClAALRRNLEERAGWTR 
EDLKALRQRHRTVADKQNSMANMMGMPQTPSGGLEDYTHDLTEQARSGKLEPVIGRDKEISRMIQILSRKTKNNPVLV 
GDAGVGKTALALGLAQRIASGDVPAEMAKMRVLELDLMNVVAGTRFRGDFEERMNNIIKDIEEDGQVILFIDELHTIM 

GSGSGIDSTLDAANILKPALARGTLRTVGATTQEEYQKHIEKDAALSRRFAKVTIEEPSVADSMTILQGLKATYEKHHRV
QITDEAVETAVKMAHRYLTSRHLPDSAIDLLDEAAATVQNKAKHVKADDSDI^PADKALMDGKWKQAAQLIAKEEEV 
PVYKDLVTESD1LTTLSRLSGIPVQKLTQTDAKKYLNLEAELHKRVIGQDQAVSSISRAIRRNQSGIRSHKRPIGSFMFLGP 
TGVGKTELAKALAEVLFDDESALIRFDMSEYMEKFAASRLNGAPPGYVGYEEGGELTEKVRNKPYSVLLFDEVEKAHP 
DIFNVLLQVLDDGVLTDSKGRKVDFSNTIIIMTSNLGATALRDDKTVGFGAKDIRFDQENMEKRMFEELKKAYRPEFIN 

RIDEKVVFHSLSSDHMQEVVKIMVKPLVASLTEKGIDLKLQASALKLLANQGYDPEMGARPLRRTLQTEVEDKLAELL
LKGDLVAGSTLKIGVKAGQLKFDIAZ 

ID40 1008bp

ATGAAGAAAACATGGAAAGTGTTTTTAACGCTTGTAACAGCTCTTC

GAACTGCITCT AAAG ACAACAAAGAGGCAGAACTTAAGAAGGTTGACTTTATCCTAGACTGGACACCAAATACCA 
ACCACACAGGGCTTTATGTTGCCAAGGAAAAAGGTTATTTCAAAGAAGCTGGAGTGGATGTTGATTTGAAj\TTGC 
CACCAGAAGAAAGTTCTTCTGACTrGGTTATCAACGGAAAGGCACCATTTGCAGTGTATTTCCAAGACTACATGGC 
TAAGAAATTGGAAAAAGGAGCAGGAATCACTGCCGTTGCAGCTATTGTTGAACACAATACATCAGGAATCATCTC 

TCGTAAATCTGATAATGTAAGCAGTCCAAAAGACTTGGTTGGTAAGAAATATGGGACATGGAATGACCCAACTGA
ACTTGCTATGTTGAAAACCTTGGTAGAATCTCAAGGTGGAGACTTTGAGAAGGTTGAAAAAGTACCAAATAACGA 
CTCAAACTCAATCACACCGATTGCCAATGGCGTCTTTGATACTGCTTGGATTTACTACGGTTGGGA 
GCTAAATCTCAAGGTGTAGATGCTAACTTCATGTACTTGAAAGACTATGTCAAGGAGTTTGACTACT 
TTATCATCGCAAACAACGACTATCTGAAAGATAACAAAGAAGAAGCTCGCAAAGTCATCCAAGCCATCAAAAAA 

GGCTACCAATATGCCATGGAACATCCAGAAGAAGCTGCAGATATTCTCATCAAGAATGCACCTGAACTCAAGGAA
AAACGTGACTTTGTCATCGAATCTCAAAAATACTTGTCAAAAGAATACGCAAGCGACAAGGAAAAATGGGGTCAA 
TTTGACGCAGCTCGCTGGAATGCTTTCTACAAATGGGATAAAGAAAATGGTATCCTTAAAGAAGACTTGACAGAC 
AAAGGCTTCACCAACGAATTTGTGAAATAA 


MKKTWKVFLTLVTALVAVVLVACGQGTASKDNKEAELKKVDFILDWTPNTNHTGLYVAKEKGYFKEAGVDVDLKLP 

PEESSSDLVINGKAPFAVYFQDYMAKKLEKGAGITAVAAIVEHNTSGIISRKSDNVSSPKDLVGKKYGTWNDPTELAML 

KTLVESQGGDFEKVEKVPNNDSNSITPIANGVFDTAWIYYGWDGILAKSQGVDANFMYLKDYVKEFDYYSPVIIANND 

YLKDNKEEARKVIQAIKKGYQYAMEHPEEAADILIKNAPELKEKRDFVIESQKYLSKEYASDKEKWGQFDAARWNAFY 

KWDKENGILKEDLTDKGFTNEFVKZ 


ID41 762bp

TTGATGAGAA ACTT GAGAAGTATACTGAGACGACACATTAGTCTATTGGGCITTCTCGGAGTATTGTCAATCT 
AGTTAGCAGGTTTTCTTAAACTTCTCCCCAAGTTTATCCTGCCGACACCTCITGAAATTCT 
GACAGAGAATTTCTCTGGCACCATAGCTGGGCGACCrrGAGAGTGGCTTTACTGGGGCTGATTTTGGGA
TTGCCTGTCTTATGGCTGTGCTCATGGATAGTTTGACTTGGCTCAATGAC

CAGACCATTCCGACCATTGCCATAGCTCCTATCCTGGTCTTGTGGCTAGGTTATGGGATTTTGCCCAAGATTGTCT 
TGATTATCITAACGACAACCTTTCCCATCATCGTTAGTATTTTGGACGGTTITAGG 

GACCTTGTTTAGTCTGATGCGGGCCAAGCCTTGGCAAATCCTGTGGCATTTTAAAATCCCAGTTAGCCTGCCTTAC 
TTTTATGCAGGTCTGAGGGTCAGTGTCTCCTACGCCTTTATCACAACTGTGGTATCTGAGTGGT^

AAGGTCnTGGTGTTTATATGATTCAGTCTAAAAAACTGTTTCAGTATGATACCATGTTTGCCATT ATTATTCTGGTG 
TCGATTATCAGTCTTTTGGGTATGAAGCTGGTCGATATCAGTGAAAAATATGTGATTAAATGGAAACGTTCGTAG 

MMRNLRSILRRHISLLGFLGVLSIWQLAGFLKLLPKFILPTPLEILQPFVRDREFLWHHSWATLRVALLGLILGVLIACLM 
AVLMDSLTWLNDLIYPMMVVIQTIPTIAIAPILVLWLGYGIL

WOILWHFKIPVSLPYFYAGLRVSVSYAFrrrVVSEWLGGFEGLGVYMIQSKKLFQYDTMFAIIILVSIISLLGMKLVDISE 
KYVIKWKRSZ 

ID42 372bp 

TTGATXTTTAATCCTATTTGCTGTATGATAAGGGAAAAGAAAGGGGACAGAGATATGGCTTTTACCAATACCCACA 
TGCGATCTGCTAGTTTTGGTATTGTTACCAGCTTGCCTGATGACATCATTGACTClM'rriGGTATATCATCGACCAT 
TTCTTAAAAAATGTCTTTGAATTGGAAGAAGAACTCGAGTTTCAATTC 
ACTTTTCAAGTCAACACCTCCCTACAGCCATTGATTTTGACTTTAACCATCCTT^ 
GTACTGGTTTTAGACATGGACGGTAGAGAAACTATCCTCCTCCCAGAAGAAAATGACCTATTTTAA

MIFNPlCCMIREKKGDRDMAFrm*HMRSASFGIVTSLPDDIIDSFWYIIDHFLKNVFELEEELEFQLLNNOGKITFHFSSQ 
HLPTAIDFDFNHPFDPRYPPRVLVLDMDGRETILLPEENDLFZ 

ID43 1569bp

ACAGCGGT GTCA TTCTATCTATTTTAAGAAAAGTAATAATCAATTGTTAAA 

ATGAA ATATT TTGTTCCTAATGAGGTATTCAGTATTCGTAAATTAAAGGTGGGGACTTGCTCGGTACTATTGG 
TTTCAATTTTGGGAAGCCAAGGTATTTTATCGGATGAAGTTGTTACTAGTTCITCACCGATGGCT 

TTCTAATGCAATTACTAATGATTTAGATAATTCACCAACTGTTAATCAGAATCGTTCTGCTGAAATGATTGCCT
AATTCAACCACTAATGGTTTAGATAATTCGTTAAGTGTTAATAGCATCAGCTCTAATGGTACTATTCGT^ 
CACAATTAGACAACAGAACAGTTGAATCTACAGTAACATCTACTAATGAAAATAAGAGTTATAAGGAAGATGTTA 
TAAGTGACAGAATTATCAAAAAAGAATTTGAAGATACTGCTTTAAGTGTAAAAGATTATGGTGCAGTAGGTGATG 
G GATT CATGATGATCGACAAGCAATTCAAGATGCAATAGATGCTGCAGCTCAAGGGCTAGGTGGAGGAAATGTAT 

ATTTTCCTG A AGG AACTT ATTT AGT AA A AG A AATTG ITT I T II AA AAAGTC ATACACACTTAG AATTG A ATG AG AA

AGCTACAATTCTAAATGGTATAAATATTAAGAATCACCCTTCCATTGTTTTTATGACAGGTTTATTTACGGATGAT 
GGTGCGCAAGTAGAATGGGGCCCAACAGAAGATATTAGTTATTCTGGTGGTACGATTGATATGAACGGTGCTTTG 
AATGAAGAAGGAACTAAAGCAAAAAATCTACCACTTATAAATTCTTCAGGTGCATTTGCTATTGGGAATTCAAAT 
AACGTAACTATAAAAAATGTAACATTCAAGGATAGTTATCAAGGGCATGCTATTCAAATTGCAGGTTCGAAAAAT 

GTATTAGTTGATAATTCTCGTTTTCTTGGGCAAGCCTTACCCAAAACGATGAAGGATGGGCAAATCATAAGTAAGG
AGAGCATTCAGATTGAACCATTAACTAGAAAAGGTTTTCCTTATGCCTTGAATGATGATGGGAAAAAATCTGAAA 
ATGTGACTATTCAAAATTCCTATTTTGGCAAAAGTGATAAATCTGGGGAATTAGTAACAGCAATTGGCACACACTA 
TCAAA CATT GTCGACACAGAACCCCTCTAATATTAAAATTCAAAATAATCATTTTGATAACATGATGTATGCAGGT 
GTACGTTTTACAGGATTCACTGATGTATTAATCAAAGGAAATCGCITTGATAAGAAAGTTAAAGGAGAGAGTGTA 

CATTATCGAGAAAGCGGAGCAGCTTTAGTAAATGCTTATAGCTATAAAAACACTAAAGACCTATTAGATTTAAAT
AAACAGGTGGTTATCGCCGAAAATATATTTAATATTGCCGATCCTAAAACAAAAGCGATACGAGTTGCAAAAGAT 
AGTGCAGAATGTTTAGGAAAAGTATCAGATATTACTGTAACAAAAAATGTAATTAATAATAATTCTAAGGAAAGA 
GAACAACCAAATATTGAATTATTACGAGTTAGTGATAATTTAGTAGTCTCAGAGAATAGT 

QRCHSIYFKKSNNQLLKIVKKLEVLMKYFVPNEVFSIRKLKVGTCSVLLAISILGSQGILSDEVVTSSSPMATKESSNAITN
DLDNSPTVNQNRSAEMIASNSTTNGLDNSLSVNSISSNGTIRSNSQLDNRTVESTVTSTNENKSYKEDVISDRJIKKEFEDT 
ALSVKDYGAVGDGiHDDRQAIQDAlDAAAQGLGGGNVYFPEGTYLVKEIVFLKSHTHLELNEKATILNGINIKNHPSIVF 
MTGLFTDDGAQVEWGPTEDISYSGGTIDMNGALNEEGTKAKNLPLINSSGAFAIGNSNNVTIKNVTFKDSYQGHAIQIA 
GSKNVLVDNSRFLGQALPKTMKDGQIISKESIOIEPLTRKGFPYALNDDGKKSENVTIONSYFGKSDKSGELVTAIGTHY 

QTLSTQNPSNIKIQNNHFDNMMYAGVRFTGFTDVLIKGNRFDKKVKGESVHYRESGAALVNAYSYKNTKDLLDLNKQ
VV1AENIFNIADPKTKAIRVAKDSAECLGKVSDITVTKNVINNNSKETEQPNIELLRVSDNLVVSENS 

ID44 324bp

GTGATGAAAGAAACTCAGCTATTAAAAGGTGTTCTTGAAGGTTGTGTCTTGGATATGATTGGTCAAAAAGAGCGG
TATGGTTATGAGTTGGTTCAGACTTTGCGAGAGGCTGGATTTGATACT 

GCAAAAGTTAGAAAAAAATCAATGGATAAGAGGCGACATGCGCCCGTCGCCAGATGGTCCAGATCGGAAGTATTT 
TTCATTAATGAAAGAAGGAGAAGAGCGTGTCTCAGTCTTTTGGCAACAATGGGACGATTTGAGTCAAAAAGTAGA 
AGGGATTAAGAATGGGGGTTAA

MMKETQLLKGVLEGCVLDMIGQKERYGYELVQTLREAGFDTIVPGTIYPLLQKLEKNQWIRGDMRPSPDGPDRKYFSL
MKEGEERVSVFWQQWDDLSQKVEGIKNGGZ 

ID45 816bp 


ATGAAGAAAATGAAGTA TTAC GAAGAAACAAGCGCTTTGCrACATGAGTTTTCTGAGGAGAATCAAAAGTATTTT 
GAGGAGTTGTGGGAAAGTTTTAATCTTGCTGGATTTCTCTATGATGAA 

TGATGCTAGATTTCTCAGAAGCAGAACGAGATGGCATGAGTGCAGAGGATTATCTAGGTAAGAATCCTAAAAAAA 
TAATGAAAGAGATTCTCAAGGGAGCACCTCGCAGTTCTATCAAAGAGTCCCnTTTGACGCCAATTCTTGTCCTGGC 

GGTATTACGTTATTATCAACTACTAAGTGATTTTTCTAAAGGTCCTCTCTTAACAGTCAATTTGCTCACA
GGCAACTTCTTATTTTTCTGATTGGATTTGGACTTGTGGCCACAATTTTACGAAGAAGTTT 
AAAATGAAAATTGGCACTT ACATT GTTGTTGGGACTATAGTTCTTCTAGTTGTTTTAGGATAT 
GCTTCATACAAGAAGGAGCCTITTATATTCCGGCTCCCTGGGATAGTTTGTCTGTCTTTACGATTT 
GGTATTTGGAATTGGAAAGAAGCGGTCTTTCGTCCATTTGTCAGTATGATTATTGCCCATCTTGTGGTGGGTTCTCT 

GCTCCGTTATTATGAGTGGATGGGAATTTCAAATGTTTTCC

GAATCTTTGTCTTGTTCCGTGGGTTTAAGAAGATAAAATGGAGTGAAGTATAG 

MKKMKYYEETSALLHEFSEENQKYFEELWESFNLAGFLYDEDYLREQIYLMMLDFSEAERDGMSAEDYLGKNPKKIM 
KEILKGAPRSSIKESLLTPILVLAVLRYYQLLSDFSKGPLLTVNLLTFLGQLLIFLIGFGLVATILRRSLVQDSPKMKIGTYI
VVGTIVLLVVLGYVGMASFIQEGAFYIPAPWDSLSVFTISLVIGIWNWKEAVFRPFVSMIIAHLVVGSLLRYYEWMGISN
VFLTKVIPLAVLFIGIFVLFRGFKKIKWSEVZ

ID46 348bp 

CTGTTTTTTTATTTATACTCAATGAAAATCAAAGAGCAAACTAGGAAGCTAGCCGCAGGTTGCTCAAAACACTGTT

TTGAGGTTGTAGACGAAACTGACGAAGTCAGCTCAAAACATGTTTTTGAGGTTGTAGATGAAACTGACGAAGTCA
GCTCAAAACACTGTTTTGAGGTTGTAGATGAAACTGACGAAGTCAGCTCAAAACACTGTTTT 

AAACTGACGAAGTCAGCTCAAAACATGTTTTTGAGGTTGTAGATGAAACTGACGAAGTCAGTAACCATACATACG 
GTAGGGCGACGCTGACGTGGTTTGAAGAGATTTTCGAAGAGTATTAA 



MFFYLYSMKIKEQTRKLAAGCSKHCFEVVDETDEVSSKHVFEVVDETDEVSSKHCFEVVDETDEVSSKHCFEVVDETD 
EVSSKHVFEVVDETDEVSNHTYGRATLTWFEEIFEEYZ 

ID47 1260bp 



ATGCAGAATCTGAAATTTGCCTTTTCATCTATCATGGCTCACAAGATGCGTTCTTTGCTTACT^ 
TATCGGTGTTTCATCAGTTGTTGTGATTATGGCTTTGGGTGATTCCCTATCTCGTCAAGTCAATAAAGATATGACTA
AATCTCAGAAAAATATTAGCGTCTTTTTCTCTCCTAAAAAAAGTAAAG 

TTTTACGGTTTCTGGAAAGGAAGAGGAAGTTCCTGTTGAACCGCCAAAACCGCAAGAATCCTGGGTCCAAGAGGC 
AGCTAAACTGAAGGGAGTGGATAGTTACTATGTAACCAATTCAACGAATGCCATCTTGACCTATCAAGATAAAAA 
GGTTGAGAATGCTAATTTGACAGGTGGAAACAGAACTTACATGGACGCTGTTAAGAATGAAATTATTGCAGGTCG 
TAGTCTGAGAGAGCAAGATTTCAAAGAGTTTGCAAGTGTCATTTTGCTAGATGAGGAATTGTCCATTAGTTTATTT 
GAATCTCCTCAAGAGGCTATTAACAAGGTTGTAGAAGTCAATGGATTTAGTTACCGGGTCATTGGGGTTTATACTA 
GTCCGGAGGCTAAAAGATCAAAAATATATGGGTTTGGTGGCTTGCCTATTACTACCAATATCT
TTTTAATGTAGATGAAATAGCTAATATTGTCTTTCGAGTGAATGATACCAGTTTAACCCCAACTCTGGGTCCAGAA
CTGGCACGAAAAATGACAGAGCTTGCAGGCTTACAACAGGGAGAATACCAGGTGGCAGATGAGTCCGTTGTATTT
GCAGAAATTCAACAATCGTTTAGTTTTATGACGACGATTATTAGTT^

GAACTGGTGTCATGAACATCATGCTGGTTTCGGTGACAGAGCGCACTCGTGAGATTGGTCTTCGTAAGGCTTTGGG
TGCAACACGTGCCAATATTTTAATTCAGTTTTTGATTGAATCCATGATTTTGACCT^
TGACAATTGCAAGTGGTTTAACTGCCTTAGCAGGTTTGTTACTGCAAGGTTTAATAGAAGGTATAGAAGTTGGAGT
ATCAATCCCAGTCGCCCTATTTAGTCTTGCAGTTTCGGCTAGTGTTGGTATGATTTTTGGAGTCTTGCCAGCCAAC
AAGGCATCGAAACTTGATCCAATTGAAGCCCTTCGTTATGAATGA 



MQNLKFAFSSIMAHKMRSLLTMIGIIIGVSSVVVIMALGDSLSRQVNKDMTKSQKNISVFFSPKKSKDGSFTQKQSAFTVS

GKEEEVPVEPPKPQESWVQEAAKLKGVDSYYVTNSTNAILTYQDKKVENANLTGGNRTYMDAVKNEIIAGRSLREQDF

KEFASVILLDEELSISLFESPQEAINKVVEVNGFSYRVIGVYTSPEAKRSKIYGFGGLPITTNISLAANFNVDEIANIVFRVN

DTSLTPTLGPELARKMTELAGLQQGEYQVADESVVFAEIQQSFSFMTTIISSIAGISLFVGGTGVMNIMLVSVTERTREIG

LRKALGATRANILIQFLIESMILTLLGGLIGLTIASGLTALAGLLLQGLIEGIEVGVSIPVALFSLAVSASVGMIFGVLPANK 
ASKLDPIEALRYEZ 

ID48 705bp



CTGATGAAGCAACTAATTAGTCTAAAAAATATCTTCAGAAGTTACCGTAATGGTGACCAAGAACTGCAGGTTCTC
AAAAATATCAATCTAGAAGTGAATGAGGGTGAATTTGTAGCCATCATGGGACCATCTGGGTCTGGTAAGTCCACT
CTGATGAATACGATTGGCATGTTGGATACACCAACCAGTGGAGAATATTATCTTGAAGGTCAAGAAGTGGCTGGG
CTTGGTGAAAAACAACTAGCTAAGGTCCGTAACCAACAAATCGGTTTTGTCTTTCAGCAGTTCTTTCTTCTATCGA

AGCTCAATGCTCTGCAAAATGTAGAATTGCCCTTGATTTACGCAGGAGTTTCGTCTTCAAAACGTCGCAAGTTGGC

TGAGGAATATTTAGACAAGGTTGAATTGACAGAACGTAGTCACCATTTACCTTCAGAATTATCTGGTGGTCAAAA 

GCAACGTGTAGCCATTGCGCGTGCCTTGGTAAACAATCCTTCTATTATCCTAGCGGATGAACCGACAGGAGCCTTG 

GATACCAAAACAGGTAACCAAATTATGCAATTATTGGTTGATTTGAATAAAGAAGGAAAAACCATTATCATGGTA 

ACGCATGAGCCTGAGATTGCTGCCTATGCCAAACGTCAGATTGTCATTCGGGATGGGGTCATTTCGTCTGACAGTG 

CTCAGTTAGGAAAGGAGGAAAACTAA 

MMKQLISLKNIFRSYRNGDQELQVLKNINLEVNEGEFVAIMGPSGSGKSTLMNTIGMLDTPTSGEYYLEGQEVAGLGEK
QLAKVRNQQIGFVFQQFFLLSKLNALQNVELPLIYAGVSSSKRRKLAEEYLDKVELTERSHHLPSELSGGQKQRVAIARA
LVNNPSIILADEPTGALDTKTGNQIMQLLVDLNKEGKTIIMVTHEPEIAAYAKRQIVIRDGVISSDSAQLGKEENZ

ID49 1200bp

ATGAAGAAAAAGAATGGTAAAGCTAAAAAGTGGCAACTGTATGCAGCAATCGGTGCTGCGAGTGTAGTTGTATTG 

GGTGCTGGGGGGATTTTACTCTTTAGACAACCTTCTCAGACTGCTCTAAAAGATGAGCCTACT

CCAAGGAAGGAAGCGTGGCCTCCTCTGTTTTATTGTCAGGGACAGTAACAGCAAAAAATGAACAATATGTTTATT 

TTGATGCTAGTAAGGGTGATTTAGATGAAATCCTTGTTTCTGTGGGCGATAAGGTCAGCGAAGGGCAGGCTTTAG

CAAGTACAGTAGTTCAGAAGCGCAGGCGGCCTATGATTCAGCTAGTCGAGCAGTAGCTAGGGCAGATCGTCATAT 

CAATGAACTCAATCAAGCACGAAATGAAGCCGCTTCAGCTCCGGCTCCACAGTTACCAGCGCCAGTAGGAGGAGA 

AGATGCAACGGTGGAAAGCCCAACTCCAGTGGCTGGAAATTCTGTTGGTTGTATTGAGGGTGAATTGGGTGATGCC

CGTGATGCGCGTGCAGATGCTGCGGCGCAATTAAGCAAGGCTCAAAGTCAATTGGATGCAACAACTGTTCTCAGT 

ACCCTAGAGGGAACTGTGGTCGAAGTCAATAGCAATGTTTCTAAATCTCCAACAGGGGCGAGTCAAGTTATGGTT 

CATATTGTCAGCAATGAAAATTTACAAGTCAAGGGAGAATTGTCTGAGTACAATCTAGCCAACCTTTCTGTAGGTC

AAGAAGTAAGCTTTACTTCTAAAGTGTATCCTGATAAAAAATGGACTGGGAAATTAAGCTATATT^

TAAAAACAATGGTGAAGCAGCTAGTCCAGCAGCCGGGAATAATACAGGTTCTAAATACCCTTATACTATTGATGT

GACAGGCGAGGTTGGTGATTTGAAACAAGGTTTTTCTGTCAACATTGAGGTTAAAAGCAAAACTAAGG

GTTCCTGTTAGCAGTCTAGTAATGGATGATAGTAAAAATTATGTCTGGATTGTGGATGAACAACAAAAGGCTA

AAAGTTGAGGTTTCATTGGGAAATGCTGACGCAGAAAATCAAGAAATCACTTCTGGTTTAACGAACGGTGCTAAG

GTCATCAGTAATCCAACATCTTCCTTGGAAGAAGGAAAAGAGGTGAAGGCTGATGAAGCAACTAATTAG

MKKKNGKAKKWQLYAAIGAASVVVLGAGGILLFRQPSQTALKDEPTHLVVAKEGSVASSVLLSGTVTAKNEQYVYFD

ASKGDLDEILVSVGDKVSEGQALVKYSSSEAQAAYDSASRAVARADRHINELNQARNEAASAPAPQLPAPVGGEDATV

QSPTPVAGNSVASIDAQLGDARDARADAAAQLSKAQSQLDATTVLSTLEGTVVEVNSNVSKSPTGASQVMVHIVSNEN

LQVKGELSEYNLANLSVGQEVSFTSKVYPDKKWTGKLSYISDYPKNNGEAASPAAGNNTGSKYPYTIDVTGEVGDLKQ

GFSVNIEVKSKTKAILVPVSSLVMDDSKNYVWIVDEQQKAKKVEVSLGNADAENQEITSGLTNGAKVISNPTSSLEEGKE 

VKADEATNZ 

ID50 759bp

ATGTCACGTAAACCATTTATCGCTGGTAACTGGAAAATGAACAAAAATCCAGAAGAAGCTAAAGCATTCGTTGAA 
GCAGTTGCATCAAAACTTCCTTCATCAGATCTTGTTGAAGCAGGTATCGCTGCTCCAGCTCTTGATTTGACAACTG
TTCTTGCTGTTGCAAAAGGCTCAAACCTTAAAGTTGCTGCTCAAAACTGCT

TGGTGAAACTAGCCCACAAGTTTTGAAAGAAATCGGTACTGACTACGTTGTTATCGGTCACTCAGAACGCCGTGA

CTACTTCCATGAAACTGATGAAGATATCAACAAAAAAGCAAAAGCAATCTTTGCGAACGGTATGCTTCCAATCAT

CTGTTGTGGTGAATCACTTGAAACTTACGAAGCTGGTAAAGCTGCTGAATTCGTAGGTGCTCAAGTATCTGCTGCA

TTGGCTGGATTGACTGCTGAACAAGTTGCTGCCTCAGTTATCGCTTATGAGCCAATCTGGGCTATCGGTACTGGTA

AATCAGCTTCACAAGACGATGCACAAAAAATGTGTAAAGTTGTTCGTGACGTTGTAGCTGCTGACTTTGGTCAAG

AAGTCGCAGACAAAGTTCGTGTTCAATACGGTGGTTCTGTTAAACCTGAAAATGTTGCTTCATACATGGCTTGCCC

AGACGTTGACGGTGCCCTTGTAGGTGGTGCGTCACTTGAAGCTGAAAGCTTCTTGGCTTTG

TAA 

MSRKPFIAGNWKMNKNPEEAKAFVEAVASKLPSSDLVEAGIAAPALDLTTVLAVAKGSNLKVAAQNCYFENAGAFTG 
ETSPQVLKEIGTDYVVIGHSERRDYFHETDEDINKKAKAIFANGMLPIICCGESLETYEAGKAAEFVGAQVSAALAGLTA 
EQVAASVIAYEPIWAIGTGKSASQDDAQKMCKVVRDVVAADFGQEVADKVRVQYGGSVKPENVASYMACPDVDGAL 
VGGASLEAESFLALLDFVKZ 

ID51 1473bp

TTGAAAACAAAAATTGGATTAGCAAGTATCTGTTTACTAGGCTTGGCAACTAGTCATGTCGCTGCAAATGAAACTG

AAGTAGCAAAAACTTCGCAGGATACAACGACAGCTTCAAGTAGTTCAGAGCAAAATCAGTCTTCTAATAAAACGC

AAACGAGCGCAGAAGTACAGACTAATGCTGCTGCCCACTGGGATGGGGATTATTATGTAAAGGATGATGGTTCTA

AAGCTCAAAGTGAATGGATTTTTGACAACTACTATAAGGCTTGGTTTTATATTAATTCAGATGGTCGTTACTCGCA

GAATGAATGGCATGGAAATTACTACCTGAAATCAGGTGGATATATGGCCCAAAACGAGTGGATCTATGACAGTAA

TTACAAGAGTTGGTTTTATCTCAAGTCAGATGGGGCTTATGCTCATCAAGAATGGCAATTGATTGGAAATAAGTGG

TACTACTTCAAGAAGTGGGGTTACATGGCTAAAAGCCAATGGCAAGGAAGTTATTTCTTGAATGGTCAAGGAGCT

ATGATGCAAAATGAATGGCTCTATGATCCAGCCTATTCTGCTTATTTTTATCTAAAATCCGATGGAACTTATGCTA

ACCAAGAGTGGCAAAAAGTGGGCGGCAAATGGTACTATTTCAAGAAGTGGGGCTATATGGCTCGGAATGAGTGGC 

AAGGCAACTACTATTTGACTGGAAGTGGTGCCATGGCGACTGACGAAGTGATTATGGATGGTACTCGCTATATCTT 

TGCGGCCTCTGGTGAGCTCAAAGAAAAAAAAGATTTGAATGTCGGCTGGGTTCACAGAGATGGTAAGCGCTATTT 

CTTTAATAATAGAGAAGAACAAGTGGGAACCGAACATGCTAAGAAAGTCATTGATATTAGTGAGCACAATGGTCG 

TATCAATGATTGGAAAAAGGTTATTGATGAGAACGAAGTGGATGGTGTCATTGTTCGTCTAGGTTATAGCGGTAA 

AGAAGACAAGGAATTGGCGCATAACATTAAGGAGTTAAACCGTCTGGGAATTCCTTATGGTGTCTATCTCT

CTATGCTGAAAATGAGACCGTGCTGAGAGTGACGCTAAACAGACCATTGAACTTATAAAGAAATACAATATGAAC 

CTGTCTTACCCTATCTATTATGATGTTGAGAATTGGGAATATGTAAATAAGAGCAAGAGAGCTCCAAGTGATACA 

GGCACTTGGGTTAAAATCATCAACAAGTACATGGACACGATGAAGCAGGCGGGTTATCAAAATGTGTATGTCTAT 

AGCTATCGTAGTTTATTACAGACGCGTTTAAAACACCCAGATATTTTAAAACATGTAAACTGGGTAGCGGCCTATA 

CGAATGCTTTAGAATGGGAAAACCCTCATTATTCAGGAAAAAAAGGTTGGCAATATACCTCTTCTGAATACATGA

AAGGAATCCAAGGGCGCGTAGATGTCAGCGTTTGGTATTAA 

MKTKIGLASICLLGLATSHVAANETEVAKTSQDTTTASSSSEQNQSSNKTQTSAEVQTNAAAHWDGDYYVKDDGSKAQ 

SEWIFDNYYKAWFYINSDGRYSQNEWHGNYYLKSGGYMAQNEWIYDSNYKSWFYLKSDGAYAHQEWQLIGNKWYY 

FKKWGYMAKSQWQGSYFLNGQGAMMQNEWLYDPAYSAYFYLKSDGTYANQEWQKVGGKWYYFKKWGYMARNE 

WQGNYYLTGSGAMATDEVIMDGTRYIFAASGELKEKKDLNVGWVHRDGKRYFFNNREEQVGTEHAKKVIDISEHNG 

RINDWKKVIDENEVDGVIVRLGYSGKEDKELAHNIKELNRLGIPYGVYLYTYAENETDAESDAKQTIELIKKYNMNLSY 

PIYYDVENWEYVNKSKRAPSDTGTWVKIINKYMDTMKQAGYQNVYVYSYRSLLQTRLKHPDILKHVNWVAAYTNAL 

EWENPHYSGKKGWQYTSSEYMKGIQGRVDVSVWYZ 

ID52 774bp 

ATGAAAAAATT^ 

TGCCTTTAATGCTGGTGATGATATGAATAGCTTTACAGGTTTTAGCT

GGGAGACTCATGCTGATTTTGGCTCAGACATTTTTTCTGGCCTTCCTATCAGCCTTGATAGCGACCATTATCGGGA
CTTTTGGTGCCATTTACATCTACCAGTCTCGTAAGAAATACCAAGAAGCCTTTCTATCACTCAATAATATCCTCAT
GGTTGCGCCTGACGTTATGATTGGTGCTAGCTTCTTGATTCT 

CCGTTCTATCTAGTCACGTGGCCTTCTCCATTCCTATCGTGGTCTTGATGGTCTTGCCTCGACTCAAGGA 

TGGCGACATGATTCATGCGGCCTATGACTTGGGAGCTAGTCAATTTCAGATG 

CTGACTCCGTCTATCATTACTGGTTATTTCATGGCCTTCAC

AACAGGAAATGGCTTTTCAACCCTATCAGTCGAGATTTACTCTCGTGCTC

GCCCTGTCTGCTCTAGTCTTTCTCTTTAGTATTATCCTAGTTGTAGGTTATTACTTTATCT 

GCAAGCATGA 



MKKFANLYLGLVFLVLYLPIFYLIGYAFNAGDDMNSFTGFSWTHFETMFGDGRLMLILAQTFFLAFLSALIATIIGTFGA 
IYIYQSRKKYQEAFLSLNNILMVAPDVMIGASFLILFTQLKFSLGFLTVLSSHVAFSIPIVVLMVLPRLKEMNGDMIHAAY 
DLGASQFQMFKEIMLPYLTPSIITGYFMAFTYSLDDFAVTFFVTGNGFSTLSVEIYSRARKGISLEINALSALVFLFSIILVV 
GYYFISREKEEQAZ 

ID59 1071bp

ATGAAAAAAATCTATTCATTTTTAGCAGGAATTGC^

ATAGTAAAATCAATAGTCGAGATAGTCAAAAATTGGTTATCTATAACTGGGGAGACTATATCGATCCTGAACTCTT 

GACTCAGTTTACAGAAGAAACAGGAATTCAAGTTCAGTACGAGACTTTTGACTCCAACGAAGCCATGTACACTA 

GATAAAGCAGGGTGGAACGACCTACGATATTGCCATTCCAAGTGAATACATGATTAACAAGATGAAGGACGAAG 

ACCTCTTGGTTCCGCTTGATTATTCAAAAATTGAAGGAATCGAAAATATCGGACCAGAGTTTCTCAACCAGTCCTT 

TGACCCAGGTAATAAATTCTCCATCCCTTACTTCTGGGGAACCTTAGGAATTGTCTACAACGAAACCATGGTAGAT

GAAGCGCCTGAGCATTGGGATGACCTTTGGAAGCCGGAGTATAAGAATTCTATCATGCTCTTTGATGGGGCGCGT 

GAGGTGCTGGGACTAGGACTCAATTCCCTCGGCTACAGCCTCAACTCCAAGGATCTGCAGCAGTTGGAAGAGACA

GTGGATAAGCTCTACAAACTGACTCCAAATATCAAGGCTATCGTTGCGGACGAGATGAAGGGCTATATGATTCAG

AATAATGTTGCAATCGGCGTGACCTTCTCTGGTGAAGCCAGCCAAATGTTAGAAAAAAATGAAAATCTACGTTAT

GTGGTACCGACAGAGGCCAGCAATCTTTGGTTTGACAATATGGTCATTCCCAAAACAGTTAAAAACCAAAACTCA

GCCTATGCCTTTATCAACTTTATGTTGAAACCTG^

CAAACCTACCAGCGAAGGAATTGCTCCCAGAGGAAACAAAGGAAGATAAGGCCTTCTATCCCGATGTTGAAACCA

TGAAACACCTAGAAGTTTATGAGAAATTTGACCATAAATGGACAGGGAAATATAGCGACCTCTTCCTACAGTTTA

AAATGTATCGGAAGTAG 

MKKIYSFLAGIAAIILVLWGIATHLDSKINSRDSQKLVIYNWGDYIDPELLTQFTEETGIQVQYETFDSNEAMYTKIKQGG

TTYDIAIPSEYMINKMKDEDLLVPLDYSKIEGIENIGPEFLNQSFDPGNKFSIPYFWGTLGIVYNETMVDEAPEHWDDLW 

KPEYKNSIMLFDGAREVLGLGLNSLGYSLNSKDLQQLEETVDKLYKLTPNIKAIVADEMKGYMIQNNVAIGVTFSGEAS 

QMLEKNENLRYVVPTEASNLWFDNMVIPKTVKNQNSAYAFINFMLKPENALQNAEYVGYSTPNLPAKELLPEETKED

KAFYPDVETMKHLEVYEKFDHKWTGKYSDLFLQFKMYRKZ 

ID61 1851bp

ATGAATAAAAAACTAACAGATTATGTGATTGATCTGGTGGAAATTTTAAATAAACAACAAAAGCAGGTTTTCTGG 
GGAATATTTGATATTTTCAGTATGGTGGTTTCCATCATTGTATCTTATATTTTATT^
ACCTGTTGACTACATTATCTATACGAGTTTGGCCTTCCTGTT

CGAGCATTAGTCGTTACAGCAAGATTACGGATTTCATGAAAATCTTTTTTGGTGTGACTGCTAGCAGTGTCTTGTC 
AT ATAG TATCTGTTATGCOTCrrGCCACTCITCT^ 

GATTTTATTGCCACGGATTACTTGGCAGTTAATCTACTCCAGACGCAAAAAAGGTAGTGGTGATGGAGAACACCG 

TCGGACCTTCTTGATTGGTGCCGGTGATGGTGGGGCTCTTTTTATGGATAGTTACCAACATCCAACCAGTGAATTA

GAACTGGTCGGTATTTTGGATAAGGATTCTAAGAAAAAGGGTCAAAAACTTGGTGGTATTCCTGTTTTGGGCTCTT 
ATGACAATCTGCCTGAATTAGCCAAACGCCATCAAATCGAGCGTGTCATCGTTGCGATTCCGTCGCTGGATCCGTC 
AGAATATGAGCGTATCTTGCAGATGTGTAATAAGCTGGGTGTCAAATGTTACAAGATGCCTAAGGTTGAAACTGT 
TGTTCAGGGCCTTCACCAAGCAGGTACTGGCTTCCAAAAAATTGATATTACGGACCTTTTGGGTCGTCAGGAAATC

CGTCTTGACGAATCGCGTCTGGGTGCAGAACTGACAGGTAAGACCATCTTAGTCACAGGAGCTGGAGGTTCAATC
GGTTCTGAAATCTGTCGTCAAGTTAGTCGCTTCAATCCTGAACGCATTGTCTTGCTCGGTCATGGGGAAAACTCAA
TCTACCTTGTTTATCATGAATTGATTCGTAAGTTCCAAGGGATTGATTATGTACCTGTGATTGCGGACATTCAAGA
CTATGATCGTTTGTTGCAAGTCTTTGAGCAGTACAAACCTGCTATTGTTTATCATGCGGCAGCCCACAAGCATGTT
CCTATGATGGAGCGCAATCCAAAAGAAGCCTTCAAAAACAATATCCGTGGAACTTACAATGTTGCTAAGGCTGTT

GATGAAGCTAAAGTGTCTAAGATGGTTATGATTTCGACAGATAAGGCAGTCAATCCACCAAATGTTATGGGAGCA
ACCAAGCGCGTGGCGGAGTTGATTGTCACTGGCTTTAACCAACGTAGCCAATCAAGCTAGTGTGGAGTTGGTTTTG
GGAATGTTCTTGGTAGCCGTGGTAGTGTCATTCCAGTCTTTGAACGTCAGATTGCTGAAGGTGGGCCTGTAACGGT
GACAGACTTCCGTATGACCCGTTACTTTATGACCATTCCAGAAGCTAGCCGTCTGGTTATCCATGCTGGTGCTTAT
GCCAAAGATGGGGAAGTCTTTATCCTTGATATGGGCAAACCAGTCAAGATTTATGACTTGGCCAAGAAGATGGTG

CTTCTAAGTGGCCACACTGAAAGTGAAATTCCAATCGTTGAAGTTGGAATCCGCCCAGGTGAAAAACTCTACGAA
GAACTCTTGGTATCAACCGAACTCGTTGATAATCAAGTTATGGATAAGATTTTCGTTGGTAAGGTTAATGTCATGC
CTTTAGAATCCATCAATCAAAAGATTGGAGAGTTCCGCACTCTCAGTGGAGATGAGTTGAAGCAAGCTATTATCG
CCTTTGCTAATCAAACAACCCACATTGAATAA 

MNKKLTDYVIDLVEILNKQQKQVFWGIFDIFSMVVSIIVSYILFYGLINPAPVDYIIYTSLAFLFYQLMIGFWGLNASISRY
SKITDFMKIFFGVTASSVLSYSICYAFLPLFSIRFIILFILLSTFLILLPRITWQLIYSRRKKGSGDGEHRRTFLIGAGDGGALF
MDSYQHPTSELELVGILDKDSKKKGQKLGGIPVLGSYDNLPELAKRHQIERVIVAIPSLDPSEYERILQMCNKLGVKCYK
MPKVETVVQGLHQAGTGFQKIDITDLLGRQEIRLDESRLGAELTGKTILVTGAGGSIGSEICRQVSRFNPERIVLLGHGEN
SIYLVYHELIRKFQGIDYVPVIADIQDYDRLLQVFEQYKPAIVYHAAAHKHVPMMERNPKEAFKNNIRGTYNVAKAVD
EAKVSKMVMISTDKAVNPPNVMGATKRVAELIVTGFNQRSQSTYCAVRFGNVLGSRGSVIPVFERQIAEGGPVTVTDFR
MTRYFMTIPEASRLVIHAGAYAKDGEVFILDMGKPVKIYDLAKKMVLLSGHTESEIPIVEVGIRPGEKLYEELLVSTELV
DNQVMDKIFVGKVNVMPLESINQKIGEFRTLSGDELKQAIIAFANQTTHIEZ

ID101 1338bp 

ATGATTGAACTTTATGATAGTTACAGTCAAGAAAGTCGAGATTTACATGAAAGTCTAGTCGCT^
AACTTGGAGTGGTCATCGATGCAGATGGTTTTCTGCCTGATGGTCTGCTTTCTCCTTTTACCT
GAGGATGGAAAACCTCTCTATTTTAATCAAGTTCCCGTTTCAGATTTTTGGGAAATTTTA

CTTGTATTGAAGATGTGACGCAGGAGAGGGCTGTCATTCATTATGCTGATGGAATGCAGGCTCGCTTGGTTAAACA
GGTAGACTGGAAAGACCTAGAAGGTCGAGTACGTCAGGTTGACCACTACAATCGCTTCGGAGCTTGTTTTGCTAC
AACGACTTATAGCGCAGATAGCGAGCCGATTATGACAGTTTACCAAGATGTCAATGGTCAACAAGTTTTACTGGA
AAACCATGTGACGGGTGATATCTTATTGACTTTGCCAGGTCAGTCCATGCGTTACTTTGCAAATAAAGTTGAATTT
ATCACCTTCTTTTTGCAAGATTTGGAAATAGATACCAGTCAGCTTATCTTTAATACTCTAGCGACTCCTTTCTTGGT
TTCCTTCCATCATCCAGATAAATCTGGCTCGGATGTCTTGGTATGGCAGGAACCTCTCTATGATGCCATTCCAGGT
AATATGCAGTTGATTTTGGAAAGTGATAATGTGCGTACTAAGAAGATCATCATTCCAAATAAGGCGACTTATGAG
CGCGCTTTAGAGTTAACTGACGAGAAATACCATGATCAGTTTGTGCACTTGGGTTATCATTACCAGTTCAAACGTG
ATAATTTCCTAAGACGAGATGCCTTAATCTTGACCAATTCAGATCAGATTGAGCAAGTAGAAGCAATCGCAGGAG
CCTTGCCTGATGTCACTTTCCGTATTGCAGCGGTGACAGAGATGTCTTCTAAGCTCTTAGACATGCTTTGCTATCCT
AATGTGGCCCTTTACCAGAACGCTAGTCCACAGAAGATTCAGGAGCTGTATCAACTGTCGGATATTTACTTGGATA
TAAACCACAGTAATGAGTTGCTACAGGCAGTGCGTCAGGCCTTTGAGCACAATCTCTTGATTCTTGGCTTTAATCA
GACGGTGCACAATAGACTTTATATCGCTCCAGACCATCTATTTGAAAGTAGTGAAGTTGCTGCTTTGGTTGAGACC
ATTAAATTGGCCCTTTCAGATGTTGATCAAATGCGTCAGGCACTTGGCAAACAAGGCCAACATGCAAATTATGTTG 
ACTTGGTGAGATATCAGGAAACCATGCAAACTGTTTTAGGAGGCTAA 

MIELYDSYSQESRDLHESLVATGLSQLGVVIDADGFLPDGLLSPFTYYLGYEDGKPLYFNQVPVSDFWEILGDNQSACIE
DVTQERAVIHYADGMQARLVKQVDWKDLEGRVRQVDHYNRFGACFATTTYSADSEPIMTVYQDVNGQQVLLENHV
TGDILLTLPGQSMRYFANKVEFITFFLQDLEIDTSQLIFNTLATPFLVSFHHPDKSGSDVLVWQEPLYDAIPGNMQLILES
DNVRTKKIIIPNKATYERALELTDEKYHDQFVHLGYHYQFKRDNFLRRDALILTNSDQIEQVEAIAGALPDVTFRIAAVT
EMSSKLLDMLCYPNVALYQNASPQKIQELYQLSDIYLDINHSNELLQAVRQAFEHNLLILGFNQTVHNRLYIAPDHLFE

SSEVAALVETIKLALSDVDQMRQALGKQGQHANYVDLVRYQETMQTVLGGZ






ID102 1512bp 

ATGACAATTTACAATATAAATTTAGGAATTGGTTGGGCTAGTAGCGGTGTTGAATACGCTCAAGCCTATCGTGCT 
GTGTTTTTCGGAAATTAAATCTGTCCTCTAAGTTTATCTTTAC

ACAGCCAATATTGGTTTTGATGATAATCAGGTTATCTGGCTTTATAATCATTTCACAGATATCAAAATTGCAC 
CTAGCGTGACAGTGGATGATGTCTTGGCTTACTTTGGTGGTGAAGAAAGTCACAGAGAAAAAAATGGCAAGGTTT 
TACGTGTATTCTTTTTTGACCAAGATAAGTTTGTAACCTGTTATTTGGTTGATGAGAAC^ 
TGCCGAGTATGTTTTTAAGGGAAACCTGATTCGGAAGGATTACTTTTCTTATACGC^ 

GCTCCCAAGGACAATGTTGCAGTCTTATACCAACGAACTTTTTATAATGAAGACGGGACTCCAGTCTATGATATCT
TGATGAATCAAGGGAAGGAAGAAGTTTATCATTTCAAGGATAAGATTTTCTATGGAAAGCAAGCTTTTGTGCGTG
CCTTTATGAAATCTTTGAATTTGAATAAGTCTGATTTGGTCATTCTCGATAGGGAGACAGGT^
GTTTGAGGAAGCACAGACAGCACATCTAGCGGTAGTTGTTCATGCGGAGCATTATAGTGAAAATGCTACAAATGA
GGACTATATCCTTTGGAATAACTATTATGACTATCAGTTTACCAATGCAGATAAGGTTGACTTCTTTATCGTGTCT

ACTGATAGACAAAATGAAGTTCTACAAGAGCAATTTGCCAAATATACTCAGCATCAGCCAAAGATTGTTACCATT
CCTGTAGGCAGTATTGATTCCTTGACAGATTCAAGTCAAGGGCGCAAACCAT^ 

TTGCCAAAGAAAAGCACATTGATTGGCTTGTGAAAGCTGTGATTGAAGCTCATAAGGAGTTACCGGAACTAACCT
TTGATATCTATGGTAGTGGTGGAGAAGATTCTCTGCTTAGAGAAATTATTGCAAATCATCAGGCAGAGGACTATAT
CCAACTCAAGGGGCATGCGGAACTTTCGCAGATTTATAGCCAGTATGAGGTCTACTTAACGGCTTCTACCAGCG^
AGGATTTGGTCTGACCTTGATGGAAGCTATTGGTTCAGGTCTACCTCT

AGACCTTTATAGAGGATGGGCAAAATGGTTATTTGATTCCAAGTTCATCTGACCATGTAGAAGACCAAATCAAGC 
AAGCTTATGCCGCTAAGATTTGTCAATTGTATCAAGAAAATCGTTTC 

TGCAGAAGGCTTCTTGACCAAAGAAATTTTAGAAAAGTGGAAGAAAACAGTAGAGGAGGTGCTCCATGATTGA 

MTIYNINLGIGWASSGVEYAQAYRAGVFRKLNLSSKFIFTDMILADNIQHLTANIGFDDNQVIWLYNHFTDIKIAPTSVT
VDDVLAYFGGEESHREKNGKVLRVFFFDQDKFVTCYLVDENKDLVQHAEYVFKGNLIRKDYFSYTRYCSEYFAPKDN 
VAVLYQRTFYNEDGTPVYDILMNQGKEEVYHFKDKIFYGKQAFVRAFMKSLNLNKSDLVILDRETGIGQVVFEEAQTA 
HLAVVVHAEHYSENATNEDYILWNNYYDYQFTNADKVDFFIVSTDRQNEVLQEQFAKYTQHQPKIVTIPVGSIDSLTDS 
SQGRKPFSLITASRLAKEKHIDWLVKAVIEAHKELPELTFDIYGSGGEDSLLREIIANHQAEDYIQLKGHAELSQIYSQYE 

VYLTASTSEGFGLTLMEAIGSGLPLIGFDVPYGNQTFIEDGQNGYLIPSSSDHVEDQIKQAYAAKICQLYQENRLEAMRA
YSYQIAEGFLTKEILEKWKKTVEEVLHDZ 



ID103 2292bp

ATGTCCTCTCTTTCGGATCAAGAATTAGTAGCTAAAACAGTAGAGTTTCGTCAGCGTCITTCCGAGGGAGAAAGTC 

TAGACGATATTTTGGTTGAAGCTTTTGCTGTGGTGCGTGAAGCAGATAAGCGGATTTTAGGGATGTTTCCTTATGA
TGTTCAAGTCATGGGAGCTATTGTCATGCACTATGGAAATGTTGCTGAGATGAATACGGGGGAAGGTAAGACCTT 
GACAGCTACCATGCCTGTCTATTTGAACGCTTTTTCAGGAGAAGGAGTGATGGTTGTGACTCCTAA 
TCAAAGCGTGATGCCGAGGAAATGGGTCAAGTTTATCGTTTTCTAGGATTGACCATTGGTGTACCATTTACGGAAG 
ATCCAAAGAAGGAGATGAAAGCTGAAGAAAAGAAGCTTATCTATGCTTCGGATATCATCTACACAACCAATAGTA

ATTTAGGTTTTGATTATCTAAATGATAACCTAGCCTCGAATGAAGAAGGTAAGTTTTTACGACCGTTTAACT

GATTATTGATGAAATTGATGATATCTTGCTTGATAGTGCACAAACTCCTCTGATTATTGCGGGTTCTCCTCGTC 
AGTCTAATTACTATGCGATCATTGATACACTTGTAACAACCTTGGTCGAAGGAGAGGATTATATCTTTAAAGAGGA 
GAAAGAGGAGGTTTGGCTCACTACTAAGGGGGCCAAGTCTGCTGAGAATTTCCTAGGGATTGATAATTTATACAA 
GGAAGAGCATGCGTCTTTTGCTCGTCATTTGGTTTATGCGATTCGAGCTCATAAGCT 

TATATCATTCGTGGAAATGAGATGGTACTGGTTGATAAGGGAACAGGGCGTCTAATGGAAATGACTAAACTTCAA
GGAGGTCTCCATCAGGCTATTGAAGCCAAGGAACATGTCAAATTATCTCCTGAGACGCGGGCTATGGCCTCGATC
ACCTATCAGAGTCTTTTTAAGATGTTTAATAAGATATCTGGTATGACAGGGACAGGTAAGGTCGCGGAAAAAGAG
TTTATTGAAACTTACAATATGTCTGTAGTACGCATTCCAACCAATCGTCCGAGACAACGGATTGACTATCCAGATA
ATCTATATATCACTTTACCTGAAAAAGTGTATGCATCCTTGGAGTACATCAAGCAATACCATGCTAAGGGAAATCC

TTTACTCGTTTTTGTAGGCTCAGTTGAAATGTCTCAACTCTATTCGTCTCTCTTGTTTCGTGAAGGGA

ATGTCCTAAATGCTAATAATGCGGCGCGTGAGGCTCAGATTATCTCCGAGTCAGGTCAGATGGGGGCTGTGACAG 
TGGCTACCTCTATGGCAGGACGTGGTACGGATATCAAGCTTGGTAAAGGAGTCGCAGAGCTTGGGGGCTTGATTG
TTATTGGGACTGAGCGGATGGAAAGTCAGCGGATCGACCTACAAATTCGTGGCCGTTCTGGTCGTCAGGGAGATC
CTGGTATGAGTAAATTTTTTGTATCCTTAGAGGATGATGTTATCAAGAAATTTGGTCCATCTTGGGTGCATAAAAA

GTACAAAGACTATCAGGTTCAAGATATGACTCAACCGGAAGTATTGAAAGGTCGTAAATACCGGAAACTAGTCGA
AAAGGCTCAGCATGCCAGTGATAGTGCTGGACGTTCAGCACGTCGTCAGACTCTGGAGTATGCTGAAAGTATGAA
TATACAACGGGATATAGTCTATAAAGAGAGAAATCGTCTAATAGATGGTTCTCGTGACTTAGAGGATGTTGTTGTG
GATATCATTGAGAGATATACAGAAGAGGTAGCGGCTGATCACTATGCTAGTCGTGAATTATTGTTTCACTTTATTG
TGACCAATATTAGTTTTCATGTTAAAGAGGTTCCAGATTATATAGATGTAACTGACAAAACTGCAGTTCGTAGCTT

TATGAAGCAGGTGATTGATAAAGAACTTTCTGAAAAGAAAGAATTACTTAATCAACATGACTTATATGAACAGTT



TTTACGACTTTCACTGCTTAAAGCCATTGATGACAACTGGGTAGAGCAGGTAGACTATCTACAACAGCTATCCATG 
GCTATCGGTGGTCAATCTGCTAGTCAGAAAAATCCAATCGTAGAGTACTATCAAGAAGCCTACGCGGGCTTTGAA 
GCTATGAAAGAACAGATTCATGCGGATATGGTGCGTAATCTCCTGATGGGGCTGGTTGAGGTCACTCCAAAAGGT 
GAAATCGTGACTCATTTTCCATAA 

MSSLSDQELVAKTVEFRQRLSEGESLDDILVEAFAVVREADKRILGMFPYDVQVMGAIVMHYGNVAEMNTGEGKTLT
ATMPVYLNAFSGEGVMVVTPNEYLSKRDAEEMGQVYRFLGLTIGVPFTEDPKKEMKAEEKKLIYASDIIYTTNSNLGF
DYLNDNLASNEEGKFLRPFNYVIIDEIDDILLDSAQTPLIIAGSPRVQSNYYAIIDTLVTTLVEGEDYIFKEEKEEVWLTTK
GAKSAENFLGIDNLYKEEHASFARHLVYAIRAHKLFTKDKDYIIRGNEMVLVDKGTGRLMEMTKLQGGLHQAIEAKEH

VKLSPETRAMASITYQSLFKMFNKISGMTGTGKVAEKEFIETYNMSVVRIPTNRPRQRIDYPDNLYITLPEKVYASLEYIK
QYHAKGNPLLVFVGSVEMSQLYSSLLFREGIAHNVLNANNAAREAQIISESGQMGAVTVATSMAGRGTDIKLGKGVAE
LGGLIVIGTERMESQRIDLQIRGRSGRQGDPGMSKFFVSLEDDVIKKFGPSWVHKKYKDYQVQDMTQPEVLKGRKYRK
LVEKAQHASDSAGRSARRQTLEYAESMNIQRDIVYKERNRLIDGSRDLEDVVVDIIERYTEEVAADHYASRELLFHFIVT
NISFHVKEVPDYIDVTDKTAVRSFMKQVIDKELSEKKELLNQHDLYEQFLRLSLLKAIDDNWVEQVDYLQQLSMAIGG

QSASQKNPIVEYYQEAYAGFEAMKEQIHADMVRNLLMGLVEVTPKGEIVTHFPZ

ID104 879bp 

ATGAAACAAGAATGGTTTGAAAGTAATGATTTTGTAAAAACAACAAGCAAGAACAAGCCTGAAGAGCAAGCTCA 
AGAGGTTGCAGACAAGGCTGAAGAAAGGATACCCGATCTCGATACACCAATTGAAAAAAATACTCAGTTAGAGG
AGGAAGTCTCTCAAGCTGAAGTCGAATTGGAAAGCCAGCAAGAAGAGAAAATTGAAGCTCCTGAAGATAGTGAA
GCGAGAACAGAAATAGAAGAAAAGAAGGCATCTAATTCTACTGAAGAAGAGCCAGACCTTTCTAAAGAAACAGA
AAAAGTCACTATAGCTGAAGAGAGCCAAGAAGCTCTTCCTCAGCAAAAAGCAACCACGAAAGAGCCACTTCTTAT
CAGTAAATCTTTAGAAAGTCCTTATATCCCCGACCAAGCTCCAAAATCTAGGGATAAATGGAAAGAGCAAGTGCT
TGATTTTTGGTCTTGGCTAGTGGAAGCGATCAAATCTCCTACAAGTAAGTTGGAAACAAGTATCACACACAGTTAC

ACAGCCTTTCTCTTGCTCATTCTGTTTTCTGCATCTTCCTTTTTCTTTAGTATCTATCACATCAAACATGCTTACTAT
GGACATATAGCAAGCATTAACAGTCGCTTCCCT^

AGTAGCGACAACACTCTTCTTCTTTTCATTCCTCTTGGGTAGTTTCGTTGTGAGACG
GACTGGACGCTAGACAAGGTTCTCCAACAATATAGTCAACTCTTGGCAATTCCAATCTCCT
TTTCTTTGCTTTCTTTGATAGCCTACGATTTACAGCCCTCTTGTGTGTGA

MKQEWFESNDFVKTTSKNKPEEQAQEVADKAEERIPDLDTPIEKNTQLEEEVSQAEVELESQQEEKIEAPEDSEARTEIE
EKKASNSTEEEPDLSKETEKVTIAEESQEALPQQKATTKEPLLISKSLESPYIPDQAPKSRDKWKEQVLDFWSWLVEAIKS
PTSKLETSITHSYTAFLLLILFSASSFFFSIYHIKHAYYGHIASINSRFPEQLAPLTLFSIISILVATTLFFFSFLLGSFVVRRFIH
QEKDWTLDKVLQQYSQLLAIPISSLLLLVSLLSLIAYDLQPSCVZ

ID106 327bp 

ATGTACTTTCCAACATCCTCTGCCTTGATTGAATTTCTCATCTTGGCTGTA^
TGAGATTAGCCAAACCATTAAGCTGATCGCTAATATCAAAGA^

AGGCAATAGCTTTCTGACAACCTATTCTAGAGAGTTCCAAGGTCGCATGCGCAAATACTACTCCTTGACAAACGG
TGGTATAGAGCAGCTCTTGACCCTAAAAGATGAATGGGCACTCTATACAGACACCATCAATGGCATCATAGAAGG 
GAGTATCCGCCATGACAAGAACTGA 

MYFPTSSALIEFLILAVLEQGDSYGYEISQTIKLIANIKESTLYPILKKLEGNSFLTTYSREFQGRMRKYYSLTNGGIEQLLT
LKDEWALYTDTINGIIEGSIRHDKNZ 

ID108 954bp

ATGGATTTTGAAAAAATTGAACAAGCTTATATCTATTTACTAGAGAATGTCCAAGTCATCCAAAGTGATTTGGCGA
CCAACTTTTATGACGCCTTGGTGGAGCAAAATAGCATCTATCTGGATGGTGAAACTGAGCTAAACCAGGTCAAAG
ACAACAATCAGGCCCTTAAGCGTTTAGCACTACGCAAAGAAGAATGGCTCAAGACCTACCAGTTTCTCTTGATGA
AGGCTGGGCAAACAGAACCCTTGCAGGCCAATCACCAGTTTACACCGGATGCTATTGCTTTGCTTTTGGTGT
TGTGGAAGAGTTGTTTAAAGAGGAGGAAATTACTATCCTCGAAATGGGTTCTGGGATGGGAATTCTAGGCGCTAT

TTTCTTGACCTCGCTTACTAAAAAGGTGGATTACTTGGGAATGGAAGTGGATGATTTGCTGATTGATCTGGCAGCT

AGCATGGCAGATGTAATTGGTTTGCAGGCTGGCTTTGTCCAAGGAGATGCCGTTCGCCCACAAATGCTCAAAGAA
AGCGATGTGGTCATCAGTGACTTGCCTGTCGGCTATTATCCTGATGATGCCGTTGCGTCGCGCCATCAAGTTGCTT
CTAGCCAAGAACATACTTACGCCCATCACTTGCTCATGGAACAAGGGCTTAAGTACCTCAAGTCAGACGGATACG
CTATTTTTCTAGCTCCGAGTGATTTGTTGACCAGTCCTCAAAGTGATTTGTTAAAAGAATGGCTGAAAGAAGAGGC

GAGTCTGGTTGCTATGATTAGTCTGCCTGAAAATCTCTTTGCTAATGCCAAACAATCTAAGACT
AGAAGAAAAATGAAATAGCAGTAGAGCCTTTTGTTTATCCACTTGCTAGCTTGC
ATTTAAAGAAAATTTTCAAAAATGGACTCAAGGTACTGAAATATAA 

MDFEKIEQAYIYLLENVQVIQSDLATNFYDALVEQNSIYLDGETELNQVKDNNQALKRLALRKEEWLKTYQFLLMKA
GQTEPLQANHQFTPDAIALLLVFIVEELFKEEEITILEMGSGMGILGAIFLTSLTKKVDYLGMEVDDLLIDLAASMADVI
GLQAGFVQGDAVRPQMLKESDVVISDLPVGYYPDDAVASRHQVASSQEHTYAHHLLMEQGLKYLKSDGYAIFLAPSD
LLTSPQSDLLKEWLKEEASLVAMISLPENLFANAKQSKTIFILQKKNEIAVEPFVYPLASLQDASVLMKFKENFQKWTQG
TEIZ 

ID110 1902bp

ATGATTATTTTACAAGCTAATAAAATTGAACGTTCTTTTGCAGGAGAGGTTCTTT^ 

TTGATGAACGAGATCGGATTGCTCTTGTTGGGAAAAATGGTGCAGGTAAGTCTACTCTTTTGAAGATT^ 
AGAAGAGGAGCCAACTAGCGGAGAAATCAATAAGAAAAAAGATATTTCTCTGTCTTACCTAGCCCAAGATAGCCG 

TTTTGAGTCTGAAAATACCATCTACGATGAAATGCTTCATGTCTTTAATGATTTGCGTCGGACGGAGAGACAACT
CGTCAGATGGAGCTGGAGATGGGTGAAAAGTCTGGTGAGGATTTGGATAAACTGATGTCAGATTATGACCGCTTA
TCTGAGAATTTTCGCCAAGCAGGTGGCTTTACCTATGAAGCTGATATTCGAGCGATTTTG
ACGAGTCTATGTGGCAGATGAAAATTGCTGAGCTTTCTGGTGGTCAAAATACTCGTTTGGCACTTGCCAAA
CCTTGAAAAGCCCAATCTCTTGGTCTTGGACGAGCCAACTAACCACTTGGATATTGAAACCATCGCCTGGCTAGA

GAATTACTTGGTAAACTATAGCGGTGCCCTCATTATCGTCAGCCACGACCGTTATTTCTTGGACAAGGTTGCGACA
ATTACGCTAGATTTGACCAAGCATTCCTTGGATCGCTATGTGGGGAATTACTCTCGTTTTGTCGAA 
AAAAGCTAGTTACTGAGGCAAAAAACTATGAAAAGCAACAGAAGGAAATCGCT^ 

GCAATCTAGTTCGTGCTTCAACGACTAAACGTGCTCAATCTCGCCGTAAACAACTAGAAAAAATGGAGCGTTTGG 
ACAAGCCTGAAGCTGGCAAGAAAGCAGCCAACATGACCTTCCAGTCTGAAAAAACGTCGGGCAATGTTGTTTTGA 
CTGTTGAAAATGCAGCTGTTGGCTATGACGGGGAAGTCTTGTCACAACCT

TGCTGTCGCTATCGTTGGTCCAAATGGTATCGGCAAGTCAACCTTTATCAAGTCTATTGTGGACCAGATTCCT^

ATCAAGGGAGAAAAGCGCTTTGGCGCTAATGTTGAGGTTGGTTACTATGACCAAACCCAAAGCAAGCTGACACCA

AGTAATACGGTGCTGGATGAACTCTGGAATGATTTCAAACTGACACCAG^^

GCCTTCCTTTTCTCAGGAGATGATGTTAAAAAATCAGTCGGCATGCTATCTGGTGGCGAAAAAGCTC^

TAGCTAAATTGTCTATGGAAAACAATAACTTTTTGATTCT^

GGAAGTGCTAGAAAATGCCTTGATTGACTTTGATGGAACCTTGCTGTTTGTCAGTC
CGTGTGGCAACTCATGTTTTGGAATTGTCTGAGAATGGTTCAACTCTCTACCTTGGAGATTACGACT 
AGAAGAAAGCAACAGCAGAAATGAGTCAGACTGAGGAAGCTTCAACTAGCAATCAAGCAAAGGAAGCAAGTCCA 
GTCAATGACTATCAGGCCCAGAAAGAAAGTCAAAAAGAAGTTCGCAAACTCATGCGACAAATCGAAAGTCTAGA 

AGCTGAAATTGAAGAGCTAGAAAGTCAAAGCCAAGCCATTTCTGAACAAATGTTGGAAACAAACGATGCCGACA
AACTCATGGAATTACAGGCTGAGCTGGACAAAATCAGCCATCGTCAGGAAGAAGCTATGCTTGAGTGGGAAGAAT 
TATCAGAGCAGGTGTAA 

MIILQANKIERSFAGEVLFDNINLQVDERDRIALVGKNGAGKSTLLKILVGEEEPTSGEINKKKDISLSYLAQDSRFESENT
IYDEMLHVFNDLRRTERQLRQMELEMGEKSGEDLDKLMSDYDRLSENFRQAGGFTYEADIRAILNGFKFDESMWQMK
IAELSGGQNTRLALAKMLLEKPNLLVLDEPTNHLDIETIAWLENYLVNYSGALIIVSHDRYFLDKVATITLDLTKHSLDR
YVGNYSRFVELKEQKLVTEAKNYEKQQKEIAALEDFVNRNLVRASTTKRAQSRRKQLEKMERLDKPEAGKKAANMTF
QSEKTSGNVVLTVENAAVGYDGEVLSQPINLDLRKMNAVAIVGPNGIGKSTFIKSIVDQIPFIKGEKRFGANVEVGYYDQ
TQSKLTPSNTVLDELWNDFKLTPEVEIRNRLGAFLFSGDDVKKSVGMLSGGEKARLLLAKLSMENNNFLILDEPTNHL
DIDSKEVLENALIDFDGTLLFVSHDRYFINRVATHVLELSENGSTLYLGDYDYYVEKKATAEMSQTEEASTSNQAKEAS
PVNDYQAQKESQKEVRKLMRQIESLEAEIEELESQSQAISEQMLETNDADKLMELQAELDKISHRQEEAMLEWEELSEQ 
VZ 

ID111 1179bp

ATGAATCGCTATGCAGTGCAGTTGATTAGCCGTGGGGCTATCAATAAAATGGGAAATATGCTCTATGATTATGGA 
AATAGTGTCTGGTTGGCTTCTATGGGGACTATAGGACAGACAGTTTTAGGAATGTATCAGATTTCTGAGCTCGTCA 
CATCTATTCTCGTCAATCCCTTTGGCGGAGTTATTTCAGACCGTTTTTCTCGTCGTAAGATT^ 
CTTGTTTGTGGGATTCTTTGTCTGGCTATTTCTT
TAACATTGTGCAGGCTATTGCTTTTGCCTTTTCTCGCACAGCCAATAAAGCTATCATAACTGAAGTGGTGGAGAAA

GATGAGATTGTGATCTATAATTCTCGCTTAGAGCTGGTTTTGCAGGTTGTAG 
CCTTGTTTTACAGTTTGCAAGTCTCCATATGACGCT^ 

TGGCTTTCCTTCCAAAAGAGGAAGCAAAAGTTCAAGAGAAAAAGGCTTTTACTGGGAGAGATATTTTTGTAGATA
TCAAGGATGGGTTACACTATATCTGGCATCAGCAAGAAATTTTCTTCCTTTTGCTGGTAGC^

CTTTTTTGCAGCTTTTGAATTTCTACTTCCCTTTTCGAATCAGCTTTACGGGTCAGAAGGAGCCTATGCAAGTATTT

TAACTATGGGGGCTATTGGTTCCATCATTGGGGCTCTTCTAGCTAGTAAAATTAAAGCTAATATTTATAATCTTTT
GATTTTACTGGCTTTGACAGGTGTCGGAGTTTTTATGATGGGATTACCACTTCCAACTTTTCTTTCCTTTTCTGGAA
ATTTAGTTTGTGAATTGTTTATGACGATTTTTAATATTCACTTTTTTACTCAAGTACAAACCAAGGTTGAGAGCGAA
TTTCTTGGAAGAGTACTGAGTACAATTTTTACCTTAGCTATTCTATTTATGCCTAT^

CTTGCCAAGTGTCCATCTTTATTCTTTCTTGATTATTGGACTTGGAGTTGTAGCCTTATAT
ATGTTCGAACTCATTTTGAAAAATTGATATAA 

MNRYAVQLISRGAINKMGNMLYDYGNSVWLASMGTIGQTVLGMYQISELVTSILVNPFGGVISDRFSRRKILMTADLV 
CGILCLAISFIRNDSWMIGALIVANIVQAIAFAFSRTANKAIITEVVEKDEIVIYNSRLELVLQVVGVSSPVLSFLVLQFASL 
HMTLLLDSLTFFIAFVLVAFLPKEEAKVQEKKAFTGRDIFVDIKDGLHYIWHQQEIFFLLLVASSVNFFFAAFEFLLPFSN
QLYGSEGAYASILTMGAIGSIIGALLASKIKANIYNLLILLALTGVGVFMMGLPLPTFLSFSGNLVCELFMTIFNIHFFTQV
QTKVESEFLGRVLSTIFTLAILFMPIAKGFMTVLPSVHLYSFLIIGLGVVALYFLALGYVRTHFEKLIZ 

ID113 2466bp 

ATGCAAAATCAATTAAATGAATTAAAACGAAAAATGCTGGAATTTTTCCAGCAAAAACAAAAAAATAAAAAATCA 

GCTAGACCTGGCAAGAAAGGTTCAAGTACCAAAAAATCTAAAACCTTAGATAAGTCAGCCATTTTCCCAGCTATT 

TTACTGAGTATAAAAGCCTTATTTAACTTACTCTTTGTACTCGGTTTTCT

CTTTGGGATACGGAGTGGCCTTATTTGACAAGGTTCGGGTGCCTCAGACAGAAGAATTGGTGAATCAGGTCAAGG 

ACATCTCTTCTATTTCAGAGATTACCTATTCGGACGGGACGGTGATTGCTTCCATAGAGAGTGATTTGTTGCGCAC 

TTCTATCTCATCTGAGCAAATTTCGGAAAATCTGAAGAAGGCTATCATTGCGACAGAAGATGAACACTTTAAAGA 

ACATAAGGGTGTAGTACCCAAGGCGGTGATTCGTGCGACCTTGGGGAAATTTGTAGGTTTGGGTTCCTCTAGTGGG 

GGTTCAACCTTGACCCAGCAACTAATTAAACAGCAGGTGGTTGGGGATGCGCCGACCTTGGCTCGTAAGGCGGCA 

GAGATTGTGGATGCTCTTGCCTTGGAACGCGCCATGAATAAAGATGAGATTTTAACGACCTATCTCAATGTGGCTC 

CCTTTGGCCGAAATAATAAGGGACAGAATATTGCAGGGGCTCGGCAAGCAGCTGAGGGAATTTTCGGTGTAGATG 

CCAGTCAGTTGACTGTTCCTCAAGCAGCATTTTTAGCAGGACTTCCACAGAGTCCCATTACTTACTCTCCTTATGA

AAATACTGGGGAGTTGAAGAGTGATGAAGACCTAGAAATTGGCTTAAGACGGGCTAAGGCAGTTCTTTACAGTAT

GTATCGTACAGGTGCATTAAGCAAAGACGAGTATTCTCAGTACAAGGATTATGACCTTAAACAGGACTTTTTACC

ATCGGGCACGGTTACAGGAATTTCACGAGACTATTTATACTTTACAACTTTGGCAGAAGCTCAAGAACGTATG

GACTATCTAGCTCAGAGAGACAATGTCTCCGCTAAGGAGTTGAAAAATGAGGCAACTCAGAAGTTTTATCGAGAT 

TTGGCAGCCAAGGAAATTGAAAATGGTGGTTATAAGATTACTACT 

CAAAGTGCGGTTGCTGATTATGGCTATCTTTTAGACGATGGAACAGGTCGTGTAGAAGTAGGGAATGTCTTGATG 

GATAACCAAACAGGTGCTATTCTAGGCTTTGTAGGTGGTCGTAATTATCAAGAAAATCAAAATAATCATGCCTTTG 

ATACCAAACGTTCGCCAGCTTCTACTACCAAGCCCTTGCTGGCCTACGGTATTGCTATTGACCAGGGCTTGATGGG 

AAGTGAAACGATTCTATCTAACTATCCAACAAACTTTGCTAATGGCAATCCGATTATGTATGCT 

AACAGGAATGATGACCTTGGGAGAAGCTCTGAACTATTCATGGAATATCCCTGCTTACTGGACCTATCGTATGCTC 

CGTGAAAAGGGTGTTGATGTCAAGGGTTATATGGAAAAGATGGGTTACGAGATTCCTGAGTACGGTATTGAGAGC 

TTGCCAATGGGTGGTGGTATTGAAGTCACAGTTGCCCAGCATACCAATGGCTATCAGACCTTAGCTAATAATGGA 

GTTTATCATCAGAAGCATGTGATTTCAAAGATTGAAGCAGCAGATGGTAGAGTGGTGTATGAGTATCAGGATAAA 

CCGGTTCAAGTCTATTCAAAAGCTACTGCGACGATTATGCAGGGATTGCTACGAGAAGTTCTATCCTCTCGTGTG^ 

CAACAACCTTCAAGTCTAACCTGACTTCTTTAAATCCTACTCTGGCTAATGCAGA

AACCAACCAAGACGAAAATATGTGGCTCATGCTTTCGACACCTAGATTAACCCTAGGTGGCTGGATTGGGCATGA

TGATAATCATTCATTGTCACGTAGAGCAGGTTATTCTAATAACTCTAATTACATGGCTCATCTGGTAAATGCGATT

CAGCAAGCTTCCCCAAGCATTTGGGGGAACGAGCGCTTTGCTTTAGATCCTAGTGTAGTGAAATCGGAAGTCTTG

AAATCAACAGGTCAAAAACCAGAGAAGGTTTCTGTTGAAGGAAAAGAAGTAGAGGTCACAGGTTCGACTGTTACC 

AGCTATTGGGCTAATAAGTCAGGAGCGCCAGCGACAAGTTATCGCTTTGCTATTGGCGGAAGTGATGCGGATTAT 

CAGAATGCTTGGTCTAGTATTGTGGGGAGTCTACCAACTCCATCCAGCTCCAGCAGTTCAAGTAGTAGTTCTAGCG 

ATAGCAGTAACTCAAGTACTACACGACCTTCTTCTTCAAGGGCGAGACGATAA

MQNQLNELKRKMLEFFQQKQKNKKSARPGKKGSSTKKSKTLDKSAIFPAILLSIKALFNLLFVLGFLGGMLGAGIALGY

GVALFDKVRVPQTEELVNQVKDISSISEITYSDGTVIASIESDLLRTSISSEQISENLKKAIIATEDEHFKEHKGVVPKAVIR

ATLGKFVGLGSSSGGSTLTQQLIKQQVVGDAPTLARKAAEIVDALALERAMNKDEILTTYLNVAPFGRNNKGQNIAGA

RQAAEGIFGVDASQLTVPQAAFLAGLPQSPITYSPYENTGELKSDEDLEIGLRRAKAVLYSMYRTGALSKDEYSQYKDY

DLKQDFLPSGTVTGISRDYLYFTTLAEAQERMYDYLAQRDNVSAKELKNEATQKFYRDLAAKEIENGGYKITTTIDQKI

HSAMQSAVADYGYLLDDGTGRVEVGNVLMDNQTGAILGFVGGRNYQENQNNHAFDTKRSPASTTKPLLAYGIAIDQG

LMGSETILSNYPTNFANGNPIMYANSKGTGMMTLGEALNYSWNIPAYWTYRMLREKGVDVKGYMEKMGYEIPEYGIE

SLPMGGGIEVTVAQHTNGYQTLANNGVYHQKHVISKIEAADGRVVYEYQDKPVQVYSKATATIMQGLLREVLSSRVTT

TFKSNLTSLNPTLANADWIGKTGTTNQDENMWLMLSTPRLTLGGWIGHDDNHSLSRRAGYSNNSNYMAHLVNAIQQA

SPSIWGNERFALDPSVVKSEVLKSTGQKPEKVSVEGKEVEVTGSTVTSYWANKSGAPATSYRFAIGGSDADYQNAWSSI 

VGSLPTPSSSSSSSSSSSDSSNSSTTRPSSSRARRZ 

ID114 1974bp

AT GAAAA AATTTTATGTAAGTCCAATTTTTCCTATTCTAGTAGGATTG 

TAi rrriGTTAATAATAATCTGTTGACGGTTTTAATTTTG I l TC TTTT T GTAGGAGGCTATG TTT1TTT ATTTAAGAA 

ACTGAGAGTGCATTATACAAGGAGTGATGTAGAACAGATACAGTATGTAAACCACCAAGCGGAAGAAAGTTTGAC 

AGCTCTATTGGAACAGATGCCTGTAGGTGTTATGAAATTGAATTTATCTTCTGGAGAGGTTGAGTGGTTTAATCCC

TATGCTGAATTGATTTTGACCAAGGAAGATGGTGATTTTGATTTAGAAGCTGTTCAAACGATTATCAAGGCTTC

TAGGAAATCCGTCTACTTATGCCAAGCTTGGTGAGAAGCGTTATGCTGTTC

GTATTTTGTAGATGTATCCAGGGAACAAGCCATAACAGATGAATTGGTAACAAGTAGACCAGTGATTGGGATTGT

CTCTGTGGATAATTATGATGATTTGGAGGATGAAACTTCTGAGTCAGATATTAGTCAAATCAATAGTTTTGTAGCT

AATTTTATATCAGAGTTTTCAGAAAAACACATGATGTTTTCTCGTCGGGTAAGTATGGATCGATTTTATCT

TGACTACACGGTGCTTGAGGGCTTGATGAATGATAAATTTTCTGTTATTGATGCTTTCAGAGAAGAGTCGA

AGACAGTTGCCCTTGACCTTAAGTATGGGATTTTCTTATGGCGATGGAAATCATGATGAGATAGGGAAAGTTGCTT

TGCTCAATTTGAACTTGGCTGAAGTACGTGGTGGCGACCAGGTGGTTGTTAAGGAAAACGACGAAACGAAAAATC

CAGTTTATTTTGGTGGTGGGTCTGCTGCTTCAATCAAGCGTACACGGACTCGTACGCGCGCTATGATGACAGCTAT
TTCAGATAAGATTCGGAGTGTAGATCAGGTTTTTGTAGTCGGTCACAAAAATTTAGACATGGATGCTTTGGGCTCT
GCTGTAGGTATGCAGTTGTTCGCCAGCAATGTGATTGAAAATAGCTATGCTCTTTATGATGAAGAACAAATGTCTC
CAGATATTGAACGAGCTGTTTCATTCATAGAAAAAGAAGGAGTTACGAAGTTGTTGTCTGTTAAGGATGCAATGG
GGATGGTGACCAATCGTTCTTTGTTGATTCTTGTAGACCAT^

TGATTTATTTACCCAAACCATTGTTATTGACCACCATAGAAGGGATCAGGATTTTCCAGATAATGCGGTTATTACT 
TATATCGAj\AGTGGTGCAAGTAGTGCCAGTGAGTTGGTAACGGAATTGATTCAGTTCCAGAATTCTAAGAAAAAT 
CGTTTGAGTCGTATGCAAGCAAGTGTCTTGATGGCTGGTATGATGTTGGATACTAAAAATTTCACCTCGCGA 
CTAGTCGGACATTTGATGTTGCTAGCTATCTCAGAACGCGCGGAAGTGATAGTATTGCTATCCAGGAAATCGCTGC 
GACAGATTTTGAAGAATATCGTGAGGTCAATGAACTTATTTTACAGGGGCGTAAATTAGGTTCAGATGTACTAATA
GCAGAGGCTAAGGACATGAAATGCTATGATACAGTTGTTATTAGTAAGGCAGCAGATGCCATGTTAGCCATGTCA 
GGTATTGAAGCGAGTTTTGTTCTTGCGAAGAATACACAAGGATTTATCTCTATCTCAGCT 

TGAATGTACAACGGATTATGGAAGAGTTAGGCGGTGGAGGCCACITTAATTTGGCAGCAGCTCAAATTAAAGATG 
TAACCITGTCAGAAGCAGGTGAAAAACTGACAGAAATTGTATTAAATGAAATGAAGGAAAAGGAGAAAGAAGAA 
TGA

MKKFYVSPIFPILVGLIAFGVI^TFIIFVNNNLLTVLILF^ 

QMPVGVMKLNLSSGEVEWFNPYAELILTKEDGDFDLEAVQTIIKASVGNPSTYAKLGEKRYAVHMDASSGVLYFVDVS
REQAITDELVTSRPVIGIVSVDNYDDLEDETSESDISQINSFVANFISEFSEKHMMFSRRVSMDRFYLFTDYTVLEGLMN 

DKFSVIDAFREESKQRQLPLTLSMGFSYGDGNHDEIGKVALLNLNLAEVRGGDQWVKENDETKNPVYFGGGSAASIK
RTRTRTRAMMTAISDKIRSVDQVFVVGHKNLDMDALGSAVGMQLFASNVIENSYALYDEEQMSPDIERAVSFIEKEGV 
TKLLSVKDAMGMVTNRSLLILVDHSKTALTLSKEFYDLFTQTIVIDHHRRDQDFPDNAVITYIESGASSASELVTELIQFQ 
NSKKNRLSRMQASVLMAGMMLDTKNFTSRVTSRTFDVASYLRTRGSDSIAIQEIAATDFEEYREVNELILQGRKLGSDV
LIAEAKDMKCYDTVVISKAADAMLAMSGIEASFVLAKNTQGFISISARSRSKLNVQRIMEELGGGGHFNLAAAQIKDVT

LSEAGEKLTEIVLNEMKEKEKEEZ

ID115 663bp 

ATG AAGTGCTTGTTATGTGGGCAG ACTATG AAG ACTGTTTT AAC nTI AGTAGTCTCTT ACTTCTG AGG AATG ATG 
ACTCTTGTCITTGTTCAGACTGTGATTCTACTTTTGA

AGAGTTGTCAACAAAGTGTCAAGATTGTCAACITTGGTGTAAAGAGGGAGTTGAAGTCAGTCATAGAGCGATTTT 

TACITACAATCAAGCTATGAAGGATTTTTTCAGTCGGTATAAGTTTGATGGAGACTTCCTG 

GCTTCATTTTTAAGTGAGGAGTTGAAAAAGTACAAAGAGTATCAATTT 

ATGCTAATAGAGGATTTAATCAGGTTGAGGGCTTGGTAGAGGCAGCAGGCITTGAGTATCTGGATTTATTAGAGA 
AAAGAGAAGAGAGAGCCAGTTCTTCTAAAAATCGTTCAGAGCGCTTGGGGACAGAACTTC CriUCU4 * r ATTAAAA
GTGGAGTCACTATTCCTAAAAAAATCCTACTTATAGATGATATCTATACTACAGGAGCAACTATAAATCGTGTTAA 
GAAACTGTTGGAAGAAGCTGGTGCTAAGGATGTAAAAACATTTTCCCTTGTAAGATGA 

MKCLLCGQTMKTVLTFSSLLLLRNDDSCLCSDCDSTFERIGEENCPNCMKTELSTKCQDCQLWCKEGVEVSHRAIFTY 
NQAMKDFFSRYKFDGDFLLRKVFASFLSEELKKYKEYQFVVIPLSPDRYANRGFNQVEGLVEAAGFEYLDLLEKREER
ASSSKNRSERLGTELPFFIKSGVTIPKKILLIDDIYTTGATINRVKKLLEEAGAKDVKTFSLVRZ 

ID116 1299bp 

ATGAAAGTAAATTTAGATTATCTCGGTCGTTTATTTACTGAGAATGAATTAACAGAAGAAGAACGTCAGTTGGCG
GAGAAACTTCCAGCAATGAGAAAGGAGAAGGGGAAACTTTTCTGTCAACGCTGTAATAGTACTATTCTAGAAGAA 
TGGT ATTT GCCCATCGGTGCTTACTATTGTCGAGAGTGCTTGCTGATGAAGCGAGTCAGAAGTGATCAAACTTTAT 
ACTATTTTCCGCAGGAGGATTTTCCAAAGCAAGATGTTCTCAAATGGCGCGGCCAATTAACTCCTTTTCAAGAGAA 
GGTGTCAGAGGGATTGCTTCAAGTAGTAGACAAGCAAAAGCCAACCTTAGTTCATGCGGTAACAGGAGCTGGAAA 

GACAGAAATGATTTATCAAGTAGTGGCTAAAGTGATCAATGCGGGTGGTGCAGTGTGTTTGGCTAGTCCTCGCAT
AGATGTTTGT^GGAGCTGTACAAGCGCCTGCAACAGGATTTTTCTTGCGGGATAGCTTTGCT 
GAACCTTATTTTCGAACACCACTAGTTGTTGCAACAACCCATCAGTTAT^ 

GATAGTGGATGAAGTAGATGCTTTTCCTTATGTTGATAATCCCATGCriTACCACGCTGTCAAGAATAGTGTAAAG 
GAGAATGGATTGAGAATCTTTTTAACAGCGACTTCGACCAATGAGTTAGATAAAAAGGTCCGTTTAGGAGAACTA 

AAAAGACTGAATTTACCGAGACGGTTTCATGGAAATCCGTTGATTATTCCAAAA

ATCGCTACTTAGACAAGAATCGTTTGTCACCAAAGTTAAAGTCCTATATTGAGAAGCAGAGAAAGACAGCTTATC 
CGTTACTCAT TTTTG CnTCAGAAATTAAGAAAGGGGAGCAGTTAGCAGAAATCTTACAGGAGCAATTTCCAAATG 
AGAAAATTGGCTTTGTATCTTCTGTAACAGAGGATCGATTAGAGCAAGTACAAGCTTTTCGAGATGGAGAACT 
CAATACTTATCAGTACGACAATCrTGGAGCGCGGAGTTACCTTCCCTTGTGTGGATGTTTTCGTAGTAGAGGCCAA 

TCATCG TTTGT TTACCAAGTCTAGTTTGATTCAGATTGGTGGACGAGTTGGACGAAGCATGGATAGACCGACAGGA
GATTTGC1 111 CT1 CCATGATGGGTTAAATGCTTCAATCAAGAAGGCGATTAAGGAAATTCAGATGATGAATAAGG 
AGGCTGGTCTATGA 

MKVNLDYLGRLFTENELTEEERQLAEKLPAMRKEKGKLFCQRCNSTILEEWYLPIGAYYCRECLLMKRVRSDQTLYYF 
PQEDFPKQDVLKWRGQLTPFQEKVSEGLLQVVDKQKPTLVHAVTGAGKTEMIYQVVAKVINAGGAVCLASPRIDVCL
ELYKRLQQDFSCGIALLHGESEPYFRTPLVVATTHQLLKFYQAFDLLIVDEVDAFPYVDNPMLYHAVKNSVKENGLRIF
LTATSTNELDKKVRLGELKRLNLPRRFHGNPLIIPKPIWl^DFNRYLDKNRLSPKLKSYIEKQRKTAYPLLIFASEIKKGE
QLAEILQEQFPNEKIGFVSSVTEDRLEQVQAFRDGELTILISTTILERGVTFPCVDVFVVEANHRLFTKSSLIQIGGRVGRS
MDRPTGDLLFFHDGLNASIKKAIKEIQMMNKEAGLZ

ID117 870bp

ATGCAAATTCAAAAAAGTTTTAAGGGGCAGTCTCCCTATGGCAAGCTGTATCTAGTGGCAACGCCGATTGGCAAT 
CTAGAT GATAT GACTITTCGTGCTATCCAGACCTTGAAAGAAGTGGACTGGATTGCTGCTGAGGATACGCGCAAT 
ACAGGGCITTTGCrCAAGCATTTTGACATTTCCACCAj\^

ATTCCTGATTTGATTGGTTTCTTGAAAGCAGGGCAAAGTATTGCTCAGGTCTCTGATGCCGGTTTGCCTAGCATTT 
CAGACCCTGGTCATGATTTAGTTAAGGCAGCTATTGAGGAAGAAATTGCAGTTGTGACAGTTCCAGGTGCCTCTGC 
AGGAATTTCTGCCTTGATTGCCAGTGGTTTAGCGCCACAGCCACATATCri'ri ACGG ri'I'l'I^'ACCGAGAA^ 
GGTCAGCAGAAGCAATTTTTTGGCTTGAAAAAAGATTATCCTGAAACACAGATTTTTT 
TAGCAGACACGTTGGAAAATATGTTAGAAGTCTACGGTGACCGCTCCGTTGTCTTGGTCAGGGAATTGACCAAAA
TCTATGAAGAATACCAACGAGGTACTATCTCTGAGTTATTAGAAAGCATTGCTGAAACGCCACTCAAGGGCGAAT 
GTCTTCTCATTGTTGAGGGTGCCAGTCAGGGTGTGGAGGAAAAGGACGAGGAAGACTTGTTCGTAGAAATTCAAA 
CCCGCATCCAGCAAGGTGTGAAGAAAAACCAAGCTATCAAGGAAGTCGCTAAGATTTACCAGTGGAATAAAAGTC 
AGCTCTACGCTGCCTACCACGACTGGGAAGAAAAACAATAA 



MQIQKSFKGQSPYGKLYLVATPIGN

GFLKAGQSIAQVSDAGLPSISDPGHDLVKAAIEEEIAVVTVPGASAGISALIASGLAPQPHIFYGFLPRKSGQQKQFFGLKK
DYPETQIFYESPHRVADTLENMLEVYGDRSVVLVRELTKIYEEYQRGTISELLESIAETPLKGECLLIVEGASQGVEEKDE 
EDLFVEIQTRIQQGVKKNQAIKEVAKIYQWNKSQLYAAYHDWEEKQZ

ID118 345bp



ATGATAAAGAAAGGAAAGGGCTGTTTTATGGACAAAAAAGAATTATTTGACGCGCTGGATGATTTTTCCCAACAA
TTATTGGTAACCITAGCCGATGTGGAAGCCATCAAGAAAAATCrCAAGAGCCTGGTAGAGGAAAATACAGCTCTT 
CGCTTGGAAAATAGTAAGTTGCGAGAACGCTTGGGTGAGGTGGAAGCAGATGCTCCTGTCAAGGCCAAGCATGTT
CGCGAAAGTGTCCGTCGTATTTACCGTGATGGATTTCACGTATGTAATGATTTTTATGGACAACGTCGAGAGCAGG 
ACGAAGAATGTATGTTTTGTGACGAGTTGTTATACAGGGAGTAA 

MIKKGKGCFMDKKELFDALDDFSQQLLVTLADVEAIKKNLKSLVEENTALRLENSKLRERLGEVEADAPVKAKHVRES 
VRRIYRDGFHVCNDFYGQRREQDEECMFCDELLYREZ

ID119 639bp 

ATGT CAAA AGGATTTTTAGTCTCTCTTGAGGGACCAGAGGGAGCAGGCAAGACCAGTGTTTTAGAGGCTCTGCTA 
CCAATTTTAG AGGA AAAAGGAGTAGAGGTGTTGACGACCCGTGAACCTGGCGGAGTCTTGATTGGGGAGAAGATT
CGGGAAGTGATTTTGGATCCAAGTCATACTCAGATGGATGCTAAAACAGAGCTACITCTCTATATTGCCAGTCGCA 
GACAGCATTTGGTGGAAAAAGTTCITCCAGCCCrTGAAGCTGGCAAGTTGGTCATCATGGATCGTTTTATCGATAG 
TTCTGTTGCCrATCAGGGATTTGGTCGTGGCTrAGATATTGAAGCCATTGACTGGCrCAATCAGTTTGC 
GGCCTCAAACCCGATTTGACACTCTATTTrGACATCGAGGTGGAAGAAGGGCTGGCTCGTATTGCTGCTAATAGTG 
ACCGCGAGGTTAATCGTTTGGATTTGGAAGGGTTGGACTTGCATAAAAAAGTTCGTCAAGGCTACCIT^

GGATAAAGAGGGAAATCGCATTGTCAAGATTGATGCTAGTCTCCCTTTGGAGCAAGTTGTGGAAACTACCAAGGC 
TGTCTTGTTTGACGGAATGGGCTTGGCCAAATGA 

MSKGFLVSLEGPEGAGKTSVLEALLPILEEKGVEVLTTREPGGVLIGEKIREVILDPSHTQMDAKTELLLYIASRRQHLVE 
KVLPALEAGKLVIMDRFIDSSVAYQGFGRGLDIEAIDWLNQFATDGLKPDLTLYFDIEVEEGLARIAANSDREVNRLDL
EGLDLHKKVRQGYLSLLDKEGNRIVKIDASLPLEQVVETTKAVLFDGMGLAKZ 

ID120 408bp 

ATGGTAGAACAAAGAAAATCAATTACCATGAAAGATGTTGCTTTAGAAGCAGGAGTTAGTGTTGGAACTGTTTCA
CGTGTAATTAATAAAGAAAAAGGCATTAAAGAAGTAACITTGAAAAAAGTGGAACAAGCGATTAAAACITTGAAT 
TACATTCCAGA TTACT ACGCTAGAGGAATGAAAAAAAATCGAACAGAAACGATTGCAATCATTGTACCAAGTATC 
TGGCATCCCTTCI 1 II CAGAATTTGCTATGCATGTGGAAAATGAAGTCTATAAGAGAAATAACAAATTACrcrTAT 
GTTCTATCAATGGTACAAATAGAGAGCAAGACTATCTGGAGATGTTGCGTCATAATAAAGTTGATGGAGTGGTTG 
CCATTACCrATAGGCCAATTGAACATTACTTGACGTCAGGAATrCCCTTTGTTAGTATTGACCGCACATACTCAGA 
GATTGCCATTCCTTGTGTTTCA 






MVEQRKSITMKDVALEAGVSVGTVSRVINKEKGIKEVTLKKVEQAIKTLNYIPDYYARGMKKNRTETIAIIVPSIWHPFF
SEFAMHVENEVYKRNNKLLLCSINGTNREQDYLEMLRHNKVDGVVAITYRPIEHYLTSGIPFVSIDRTYSEIAIPCVS 
ID121 285bp

A TGAA TATATTTAGAACAAAGAATGTTAGTTTAGATAAAACAGAGATGCATAGGCATTTGAAGTTATGGGATTTG 
ATTTTGCTGGGTATCGGAGCCATGGTAGGGACAGGCGTCTTTACAATCACAGGTACTC 
GCCCAGCCCTAGTGATTTCAATCGTTATTTCTGCCTTGTGTGTGGGATTATCAGCCCT

TCGCGAGTACCCGCTACAGGAGGTGCCrATAGTTACCTCTATGCTATCTTAGGAGAATTCCCTGCCTGGTTGGCTG 
GTTGGTTAACCATGATGGAGTTCATGACAGCCATATCAGGCGTAGCTTCGGGTTGGGCAGCTTATTTTAA 

MNIFRTKNVSLDKTEMHRHLKLWDLILLGIGAMVGTGVFTITGTAAATLAGPALVISIVISALCVGLSALFFAEFASRVP
ATGGAYSYLYAILGEFPAWLAGWLTMMEFMTAISGVASGWAAYF

ID124 1311bp

ATGAAATCAAGAGTAAAGGAAACGAGTATGGATAAAATTGTGGTTCAAGGTGGCGATAATCGTCTGGTAGGAAGC

GTGACGATCGAGGGAGCA AAAA ATGCAGTCTTACCCTTGTTGGCAGCGACTATTCTAGCAAGTGAAGGAAAGACC
G TCTT GCAGAATGTTCCGATTTTGTCGGATGTCTTTATTATGAATCAGGTAGTrGGTGGTTT 
ACnTGATGAGGAAGCTCATCTTGTCAAGGTGGATGCTACTGGCGACATCACrGAGGAAGCCCCT^ 
TCAGCAAGATGCGCGCCTCCATCGTTGTATTAGGGCCAATCCTTGCCCGTGTGGGTCATGCCAAGGTATCCATGCC 
AGGTGGTTGTACGATTGGTAGCCGTCCTATTGATCTTCATTTGAAAGGTCTGGAAGCTATGGGGGTTAAGATTAGT 

CAGACAGCTGGTTACATCGAAGCCAAGGCAGAACGCTTGCATGGTGCTCATATCTATATGGACTTTCCAAGTGTTG
GTGCAACGCAGAACTTGATGATGGCAGCGACTCTGGCTGATGGGGTGACAGTGATTGAGAATGCTGCGCGTGAGC 
CTGAGATTGTTGACTTAGCCATTCTCCTTAATGAAATGGGAGCCAAGGTCAAAGGTGCTGGTACAGAGACTATAA 
CCATTACTGGTGTTGAGAAACTTCATGGTACGACTCACAATGTAGTCCAAGACCGTATCGAAGCAGGAACCTTTAT 
GGTAGCTGCTGCCATGACTGGTGGTGATGTCTTGATTCGAGACGCTGTCTGGGAGCACAACCGTCCCTTGATTGCC 

AAGTTACTTGAAATGGGTGTTGAAGTAATTGAAGAAGACGAAGGAATTCGTGTTCGTTCTCAACTAGAAAATCT^
AAAGCTGTTCATGTGAAAACCTTGCCCCACCCAGGATTTCCAACAGATATGCAGGCTCAATTTACAGCC^ 
CAGTTGCAAAAGGCGAATCAACCATGGTGGAGACAGTTTTCGAAAATCGTTTCCAAACCTAGAAGAGATGCGCCG 
CATG GGCTTGCATTCTGAGATTATCCGTGATACAGCTCGTATTGTTGGTGGACAGCCTTTGCAGGGAGCAGAAGTT 
CTTTCAACTGACCTTCGTGCCAGTGCGGCCTTGATTTTGACAGGTTTGGTAGCACAGGGAGAAACT 

AATTGGTTCACITGGATAGAGGTTACTACGGTTTCCATGAGAAGTTGGCGCAGCTAGGTGCTAAGATTCAGCGGAT
TGAGGCAAGTGATGAAGATGAATAA

MKSRVKETSMDKIVVQGGDNRLVGSVTIEGAKNAVLPLLAATILASEGKTVLQNVPILSDVFIMNQVVGGLNAKVDFD 
EEAHLVKVDATGDITEEAPYKYVSKMRASIVVLGPILARVGHAKVSMPGGCTIGSRPIDLHLKGLEAMGVKISQTAGYIE 
AKAERLHGAHIYMDFPSVGATQNLMMAATLADGVTVIENAAREPEIVDLAILLNEMGAKVKGAGTETITITGVEKLHG
TTHNVVQDRIEAGTFMVAAAMTGGDVLIRDAVWEHNRPLIAKLLEMGVEVIEEDEGIRVRSQLENLKAVHVKTLPHP 
GFPTDMQAQFTALMTVAKGESTMVETVFENRFQHLEEMRRMGLHSEIIRDTARIVGGQPLQGAEVLSTDLRASAALIL
TGLVAQGETVVGKLVHLDRGYYGFHEKLAQLGAKIQRIEASDEDEZ 

ID125 1101bp

ATGTTATTAGCGTCAACAGTAGCCTTGTCATTTGCCCCAGTATTGGCAACTCAAGCAGAAGAAGTTCTTTGGACTG 
CAC GTAG TGTTGAGCAAATCCAAAACGATTTGACTAAAACGGACAACAAAACAAGTTATACCGTACAGTATGGTG 
ATACTTTG AGCA CCATTGCAGA AGCCT TGGGTGTAGATGTCACAGTGCTTGCGAATCTGAACAAAATCACTAATAT 

GGACTTGATTTTCCCAGAAACTGTTTTGACAACGACTGTCAATGAAGCAGAAGAAGTAACAGAAGTTGAAATCCA
AACACCTCAAGCAGACTCTAGTGA AGAAG TGACAACTGCGACAGCAGATTTGACCACTAATCAAGTGACCGTTGA 
TGATCAAACTGTTCAGGTTGCAGACCTTTCTCAACCAATTGCAGAAGTTACAAAGACAGTGATTGCTTCTGAAGAA 
GTGGCACCATCTACGGGCACTTCTGTCCCAGAGGAGCAAACGACCGAAACAACTCGCCCAGTTGCAGAAGAAGCT 
CCTCAGGAAACGACTCCAGCTGAGAAGCAGGAAACACAAACAAGCCCTCAAGCTGCATCAGCAGTGGAAGCAAC 

TACAACAAGTTCAGAAGCAAAAGAAGTAGCATCATCAAATGGAGCTACAGCAGCAGTTTCTACTTATCAACCAGA
AGAAACGAAAGTAATTTCAACAACTTACGAGGCTCCAGCTGCGCCCGATTATGCTGGACTTGCAGTAGCAAAATC 
TGAAAATGCAGGTCTTCAACCACAAACAGCTGCCTTTAAWGAAGAAATTGCTAACTTGTTTGGCATTACATCCT^ 
AGTGGTTATCGTCCAGGAGACAGTGGAGATCACGGAAAAGGTTTGGCTATCGACTTTATGGTACCAGAACGTTCA 
GAATTAGGGGATAAGATTGCGGAATATGCTATTCAAAATATGGCCAGCCGTGGCATTAGTTACATCATCTGGAAA 

CAACGTTTCTATGCTCCATTCGATAGCAAATATGGGCCAGCTAACACTTGGAACCCAATGCCAGACCGTGGTAGT
GTGACAGAAAATCACTATGATCACGTTCACGTTTCAATGAATGGATAA



MLLASTVALSFAPVLATQAEEVLWTARSVEQIQNDLTKTDNKTSYTVQYGDTLSTIAEALGVDVTVLANLNKITNMDL
IFPETVLTTTVNEAEEVTEVEIQTPQADSSEEVTTATADLTTNQVTVDDQTVQVADLSQPIAEVTKTVIASEEVAPSTGTS 
VPEEQTTETTRPVAEEAPQETTPAEKQETQTSPQAASAVEATTTSSEAKEVASSNGATAAVSTYQPEETKVISTTYEAPA 
APDYAGLAVAKSENAGLQPQTAAFKKKLLTCLALHPLVVIVQETVEITEKVWLSTLWYQNVQNZGIRLRNMLFKIWPA
VALVTSSGNNVSMLHSIANMGQLTLGTQCQTVVVZQKITMITFTFQZMD 
ID126 1281bp

TTGTTTAAGAAAAATAAAGACATTCTTAATATTGCATTGCCAGCTATGGGTGAAAACTTTTTGCAGATC
GAATGGTGGACAGTTATTTGGTTGCTCATTTAGGATTGATAGCTATTTCAGGGGTTTCAGTAGCT 
CACCATTTATCAGGCGATTTTCATCGCTCTGGGAGCTGCTATTTCCAGTGTTATTTCAAAAAGCAT^ 
GACCAGTCGAAGTTGGCCTATCATGTGACTGAGGCGTTGAAGATTACCT^ 

TGTCCATCTTCGCTGGGAAAGAGATGATAGGACTTTTGGGGACGGAGAGGGATGTAGCTGAGAGTGGTGGACTGT 
ATCTATCTTTGGTAGGCGGATCGATTGTTCTCTTAGGTTTAATGACTAGTCTAGGAGCCTTGATTCGTGCAACGC
TAATCCACGTCTGCCTCTCTATGTTAGTTTTTTATCCAATGCCTTGAAT^ 
TCTGGATATGGGGATAGCTGGTGTTGCITGGGGGACAATTGTGTCTCGTTTGGTTGGT 
AATTAAAACTGCCTTATGGGAAGCCAACTTTTGGTTTAGATAAGGAACTGTTGACCTTGGCTTT 
AGAGCGACTTATGATGAGGGCTGGAGATGTAGTGATCATTGCCTTGGTCGTTTCT71*1GGGACGGAGGCAGTTGCT 
GGGAATGCAATCGGAGAAGTCTTGACCCAGTTTAACTATATGCCTGCCTTTGGCGTCGCTACGGCAACGGTCATG
CTGTTGGCCCGAGCAGTTGGAGAGGATGATTGGAAAAGAGTTGCTAGTTTGA 

TGTTCCTCATGTTGCCCCTGTCCTTTAGTATATATGTCTTGGGTGTACCATTAACTCATCTCTATACGACTGATTCT 
CTAGCGGTGGAGGCTAGTGTTCTAGTGACACTGTTTTCACTACITGGGACCCCTATGACGACAGGAACAGTCATCT 
ATACGGCAGTCTGGCAGGGATTAGGAAATGCACGCCTCCCTTTTTATGCGACAAGTATAGGAATGTGGTGTATCC 
GCATTG GGAC AGGATATCTGATGGGGATTGTGCTTGGTTGGGGCTTGCCTGGTATTTGGGCAGGGTCTCTCTTGGA

TAATGGTTTTCGCTGGTTATTTGTAGGCTATGGTTAGGAGGGGTTATATGAGGTTGAAAGGATAG

LFKKNKDILNIALPAMGENFLQMLMGMVDSYLVAHLGLIAISGVSVAGNIITIYQAIFIALGAAISSVISKSIGQKDQSKLA 
YHVTEALKITLLl^FLLGFLSIFAGKEMIGLLGTERDVAESGGLYLSLVGGSIVLLGLMTSLGALIRATHNPRLPLYVSFL 

SNALNILFSSLAIFVLDMGIAGVAWGTIVSRLVGLVILWSQLKLPYGKPTFGLDKELLTLALPAAGERLMMRAGDVVIIA

LVVSFGTEAVAGNAIGEVLTQFNYMPAFGVATATVMLLARAVGEDDWKRVASLSKQTFWLSLFLMLPLSFSIYVLGVP 
LTHLYTTDSLAVEASVLVTLFSLLGTPMTTGTVIYTAVWQGLGNARLPFYATSIGMWCIRIGTGYLMGIVLGWGLPGIW 
AGSLLDNGFRWLFLRYRYQRYMSLKGZ 

ID127 894bp

GTGGGAAGAATTATCAGAGCAGGTGTAAAGATGGAACATCTTGGAAAAGTATTTCGTGAATTTCGAACAAGTGGA 
AATTATTCrTTAAAGGAAG CAGC AGGCGAATCCTGCrCTACCTCTCAGTTATCTCGCTTTGAGCTTGGGGAGTCTG 
AC CTGGC AGTCrCCCGTTTCTTTGAGATTTTGGATAACATTCATGTAACAATCGAAAATTTCATGGATAAGGCAAG 

GAATTTTCATAATCATGAACATGTGTCTATGATGGCACAGATTATCCCACTTTACTATTCAAACGATATTGCAGGT
TTTC AAAAGCTTCAAAGAGAACAACTTGAAAAGTCTAAGAGTTCGACGACTCCCCITTATTTTGAGCTGAACTGGA 
TTTTGCTACAAGGTCrGATTTGTCAAAGAGATGCGAGTTATGATATGAAGCAGGATGATTTGGGTAAGGTAGCAG 
ATTATCTCTTCAAAACAGAAGAATGGACCATGTATGAGTTGATTCTTTTCGGTAACCTCTATAGTTTCTACGATGT 
AGACTATGTC ACTCG GATTGGTAGAGAAGTTATGGAGAGGGAGGAATTTTACCAAGAGATTAGTCGCCATAAGAG 

ATTAGTGTTGATTTTGGCCCTCAATTGTTACCAGCATTGTTTAGAGCATTCrrCITTTTATAATG^

AGGCTTATACAGAGAAGATTATTGACAAAGGTATTAAGCTTTATGAGCGTAATGTTTTCCATTATTTAAAAGGTTT 
TGCCTTATATCAAAAAGGACAGTGTAAAGAAGGCTGTAAGCAGATGCAAGAGGCCATGCATATTTTTGATGTGTT 

AGGTCTTCCAGAGCAAGTAGCCTATTATCAGGAACACTACGAAAAATTTGTCAAAAGTTAA

VGRIIRAGVKMEHLGKVFREFRTSGNYSLKEAAGESCSTSQLSRFELGESDLAVSRFFEILDNIHVTIENFMDKARNFHN
HEHVSMMAQIIPLYYSNDIAGFQKLQREQLEKSKSSTTPLYFELNWILLQGLICQRDASYDMKQDDLGKVADYLFKTEE
WTMYELILFGNLYSFYDVDYVTRIGREVMEREEFYQEISRHKRLVLILALNCYQHCLEHSSFYNANYFEAYTEKIIDKGI
KLYERNVFHYLKGFALYQKGQCKEGCKQMQEAMHIFDVLGLPEQVAYYQEHYEKFVKSZ 
TABLE 3
ID1 1068bp 

ATGTCTAACATTCAAAACATGTCCCTGGAGGACATCATGGGAGAGCGCnTTGGTCGCTACTCCAAGTACATTATTC 

AAGACCGGGCTTTGCCAGATATTCGTGATGGGTTGAAGCCGGTTCAGCGCCGTATTCTTTATTCTATGAATAAGGA 

TAGCAATACTTTTGACAAGAGCTACCGTAAGTCGGCCAAGTCAGTCGGGAACATCATGGGGAATTTCCACCCACA 

CGGGGATTCTTCTATCTATGATGCCATGGTTCGTATGTCACAGAACTGGAAAAATCGTGAGATTCTAGTTGAAATG 

CACGGTAATAACGGTTCTATGGACGGAGATCCTCCTGCGGCTATGCGTTATACTGAGGCACGTTTGTCTGAAATTG 

CAGGCTACCITCTTCAGGATATCGAGAAAAAGACAGTTCCTTTTGCATGGAACTTTG 

CAACGGTCTTGCCAGCAGCCTTTCCAAACCTCTTGGTCAATG 

CATTCCTCCCCATAATTTAGCTGAGGTCATAGATGCTGCAGTTTACATGATTGACCACCCAACTGCAAAGATTGAT 

AAACTCATGGAATTCTTGCCTGGACCAGACTTCCCTACAGGGGCTATTATTCAGGGTCGTGATGAAATCAAGAAA 

GCTTATGAGACTGGGAAAGGGCGCGTGGTTGTTCGTTCCAAGACTGAAATTGAAAAGCTAAAAGGTGGTAAGGAA 

CAAATCGTTATTATTGAGATTCCTTATGAAATCAATAAGGCCAATCTAGTCAAGAAAATCGATGATGTTCGTGTTA 

ATAACAAGGTAGCTGGGATTGCTGAGGTTCGTGATGAGTCTGACCGTGATGGTCTTCGTATCGCTATCGAACTTAA 

GAAAGACGCTA^TACTGAGCrTGTTCTCAACTACTTATTTAAGTACACCGACCT 

ATGGTGGCGATTGACAATTTCACACCTCGTCAGGTTGGATTGTTCCAATCCTGTCTAGCTATATCGCTCACCGTCG 
AGAAGTGA 

MSNIQNMSLEDIMGERFGRYSKYIIQDRALPDIRDGLKPVQRRILYSMNKDSNTFDKSYRKSAKSVGNIMGNFHPHGDS

SIYDAMVRMSQNWKNREILVEMHGNNGSMDGDPPAAMRYTEARLSEIAGYLLQDIEKKTVPFAWNFDDTEKEPTVLP 

AAFPNLLVNGSTGISAGYATDIPPHNLAEVIDAAVYMIDHPTAKIDKLMEFLPGPDFPTGAIIQGRDEIKKAYETGKGRV 

VVRSKTEIEKLKGGKEQIVIIEIPYEINKANLVKKIDDVR^ 

YTDLQINYNFNMVAIDNFTPRQVGLFQSCLAISLTVEKZ 

ID12 684bp 

ATGCCGACATTAGAAATAGCACAAAAAAAACTGGAGTTCATTAAGAAGGCAGAAGAATATTACAATGCCTTGTGT 

ACAAATATACAGTTGAGCGGAGATAAACTAAAAGTAATTTCCGTTACTTCTGTTAACCCTGGGGAAGGAAAAACA 

ACTACrrCCATAAATATAGCATGGTCGTTTGCGCGTGCAGGCTATAAAACTCTTTTGATCGATGG 

ATTCAGTTATGTTAGGAGTTTTTAAATCTCGTGAAAAAATTACAGGGCTAACAGAATTTTTATCTGGGACAG 

TTTATCTCACGGTTTATGTGATACAAATATTGAAAATTTATTTGTAGTTCAATCGGGATCTGTATCACCAAACCCT 

ACAGCCTTGTTACAAAGTAAAAATTTTAATGATATGATTGAAACATTGCGTAAATATTTTGATTATATCATTATTG 

ATACACCGCCTATTGGAj^TTGTTATTGATGCGGCAATTATCACTCAAAAGTGTGATGCGTCCATCTTGGTAACAGC 

AACAGGTGAGGCGAATAAACGTGATATCCAAAAAGCGAAACAACAATTAAAACAAACAGGGAAACTGTTCCTAG 

GAGTTGTTTTAAATAAATTGGATATCTCGGTTAATAAGTATGGAGTTTACGGTTCCTATGGAAATTATGGTAAAAA 

ATAA 

MPTLEIAQKKLEFIKKAEEYYNALCTNIQLSGDKLKVISVTSVNPGEGKTTTSINIAWSFARAGYKTLLIDGDTRNSVML

GVFKSREKITGLTEFLSGTADLSHGLCDTNIENLFVVQSGSVSPNPTALLQSKNFNDMIETLRKYFDYIIIDTPPIGIVIDAA 

IITQKCDASILVTATGEANKRDIQKAKQQLKQTGKLFLGVVLNKLDISVNKYGVYGSYGNYGKKZ 

ID13 1182bp

ATGGAGGCAAATATGAAACATCTAAAAACATTTTACAAAAAATGGTTTCAATTATTAGTCGTTATCGTCA 

TTTTTAGTGGAGCCTTGGGTAGTTTTrCAATAACTCAACTAACTCAAAAAAGTAGTGTAAACAACT 

TAGTACrATTACACAAACTGCCTATAAGAACGAAAATTCAACAACACAGGCTGTTAACAAAGTAAAAGATGCTGT 

TGTTTCTGTTATTACTTATTCGGCAAACAGACAAAATAGCGTATTTGGCAATGATGATACTGACACAGATTCTCAG 

CGAATCTCTAGTGAAGGATCTGGAGTTATTTATAAAAAGAATGATAAAGAAGCTTACATCGTCACCAACAATCAC 

GTTATTAATGGCGCCAGCAAAGTAGATATTCGATTGTCAGATGGGACTAAAGTACCTGGAGAAATTGTCGGAGCT 

GACACTTTCTCTGATATTGCTGTCGTCAAAATCTCTTCAGAAAAAGTGACAACAGTAGCTGAGTTTGGTGATTCT 

GTAAGTTAACTGTAGGAGAAACTGCTATTGCCATCGGTAGCCCGTTAGGTTCTGAATATGCAAATACTGTCACTCA 

AGGTATCGTATCCAGTCTCAATAGAAATGTATCCTTAAAATCGGAAGATGGACAAGCTATTTCTACAAAAGCCAT 

CCAAACTGATACTGCTATTAACCCAGGTAACTCTGGCGGCCCACTGATCAATATTCAAGGGCAGGTTATCGGAAT 

TACCTCAAGTAAAATTGCTACAAATGGAGGAACATCTGTAGAAGGTCTTGGTTTCGCAATTCCTGCAAATGATGCT 

ATCAATATTATTGAACAGTTAGAAAAAAACGGAAAAGTGACGCGTCCAGCTTTGGGAATCCAGATGGTTAATTTA 

TCTAATGTGAGTACAAGCGACATCAGAAGACTCAATATTCCAAGTAATGTTACATCTGGTGTAATTGTTCGTTCGG 

TACAAAGTAATATGCCTGCCAATGGTCACCTTGAAAAATACGATGTAATTACAAAAGTAGATGACAAAGAGATTG

CTTCATCAACAGACTTACAAAGTGCTCTTTACAACCATTCTATCGGAGACACCATTAAGATAACCTACT 

CGGGAAAGAAGAAACTACCTCTATCAAACTTAACAAGAGTTCAGGTGATTTAGAATCTTAA 

MEANMKHLKTFYKKWFQLLVVIVISFFSGALGSFSITQLTQKSSVNNSNNNSTITQTAYKNENSTTQAVNKVKDAVVSV

ITYSANRQNSVFGNDDTDTDSQRISSEGSGVIYKKNDKEAYIVTNNHVINGASKVDIRLSDGTKVPGEIVGADTFSDIAV

VKISSEKVTTVAEFGDSSKLTVGETAIAIGSPLGSEYANTVTQGIVSSLNRNVSLKSEDGQAISTKAIQTDTAINPGNSGGP
LINIQGQVIGITSSKIATNGGTSVEGLGFAIPANDAINIIEQLEKNGKVTRPALGIQMVNLSNVSTSDIRRLNIPSNVTSGVIV
RSVQSNMPANGHLEKYDVITKVDDKEIASSTDLQSALYNHSIGDTIKITYYRNGKEETTSIKLNKSSGDLESZ

ID15 939bp

ATGGCAGAAATTTATCTAGCAGGTGGTTG 1T1T1GGGGCCTAGAGGAATAIT1TICACGCATTTCTGGAGTGCTAG 
AAACCAGTGTTGGCTACGCTAATGGTCAAGTCGAAACGACCAATTACCAGTTGCTCAAGGAAACAGACCATGCAG 
AAACGGTCCAAGTGATTTACGATGAGAAGGAAGTGTCACTCAGAGAGATTTTACTTTATTATTTCCGAGTTATCGA 
TCCTCTATCTATCAATCAACAAGGGAATGACCGTGGTCGCCAATATCGAACTGGGATTTATTATCAGGATGAAGC 
AGATTTGCCAGCTATCTACACAGTGGTGCAGGAGCAGGAACGCATGCTGGGTCGAAAGATTGCAGTAGAAGTGGA
GCAATTACGCCACTACATTCTGGCTGAAGACTACCACCAAGACT 

ATCGATGTGACCGATGCTGATAAGCCATTGATTGATGCAGCAAACTATGAAAAGCCTAGTCAAGAGGTGTTGAAG 
GCCAGTCTATCTGAAGAGTCTTATCGTGTCACACAAGAAGCTGCTACAGAGGCTCCATTTACCAATGCCTATGACC 
AAACCTTTGAAGAGGGGATTTATGTAGATATTACGACAGGTGAGCCACrCTTTTTTGCCAAGGAT 
AGGTTGTGGTTGGCCAAGTTTTAGCCGTCCGATTTCCAAAGAGTTGATTCATTATTACAAGGATCTGAGCCATGGA
ATGGAGCGAATTGAAGTTCGTTCTCGTTCAGGCAGTGCTCACTTGGGTCATGTTTTCACAGATGGACCGCGGGAGT 
TAGGCGGCCTCCGTTACTGTATCAATTCTGCTTCTTTACGCITTGTGGCCAAGGATGAGATGGAAAAAGCAGGATA 
TGGCTATCTATTGCCTTACTTAAACAAATAA 

MAEIYLAGGCFWGLEEYFSRISGVLETSVGYANGQVETTNYQLLKETDHAETVQVIYDEKEVSLREILLYYFRVIDPLSI
NQQGNDRGRQYRTGIYYQDEADLPAIYTVVQEQERMLGRKIAVEVEQLRHYILAEDYHQDYLRKNPSGYGHIDVTDA
DKPLIDAANYEKPSQEVLKASLSEESYRVTQEAATEAPFTNAYDQTFEEGIYVDITTGEPL^

SKELIHYYKDLSHGMERIEVRSRSGSAHLGHVFTDGPRELGGLRYCINSASLRFVAKDEMEKAGYGYLLPYLNKZ 
ID17 870bp

ATGAAGATTATTGTACCTGCAACCAGTGCCAATATCGGGCCAGGTTTTGACTCGGTCGGTGTAGCTGTAACCAAGT 
ATCITCAAATTGAGGTCrGCGAAGAACGAGATGAGTGGCTGATTGAACACCAGATTGGCAAATGGATTCCACATG 
ACGAGCGTAATCTCTTGCTCAAAATCGCTTTGCAAATTGTACCAGACITGCAACCAAGACGCTTGAAAATGA 
GTGATGTCCCTTTGGCGCGCGGTTTGGGTTCTTCCAGCTCG 

GGGTCAACTCAACTTATCAGACCATGAAAAATTGCAGTTAGCGACCAAGATTGAAGGGCATCCTGACAATGTGGC 
TCCAGCCATTTATGGTAATCTCGTTATTGCAAGTTCTG 
GAGTGTGATTTTCTAGCTTACATTCCAAACTATGAATTACGTACT^ 

TGTCTTATAAGGAAGCTGTTGCTGCAAGTTCTATCGCCAATGTAGCGGTTGCTGCCTTGTTGGCAGGAGACATGGT 
GACCGCTGGGCAAGCAATCGAGGGAGACCTCTTCCATGAGCGCTATCGTCAGGACTTGGTAAGAGAATTTGCGAT 
GATTAAGCAAGTGACCAAAGAAAATGGGGCCTATGCAACCTACCTTTCTGGTGCTGGGCCGACAGTTATGGTTCT 
GGCTTCTCATGACAAGATGCCAACAATTAAGGCAGAATTGGAAAAGCAACCTTTCAAAGGAAAACTGCATGAGTT 
GAGAGTTGATACCCAAGGTGTCCGTGTAGAAGCAAAATAA 

MKIIVPATSANIGPGFDSVGVAVTKYLQIEVCEERDEWLIEHQIGKWIPHDERNLLLKIALQIVPDLQPRRLKMTSDVPLA
RGLGSSSSVIVAGIELANQLGQLNLSDHEKLQLATKIEGHPDNVAPAIYGNLVIASSVEGQVSAIVADFPECDFLAYIPNY
ELRTRDSRSVLPKKLSYKEAVAASSIANVAVAALLAGDMVTAGQAIEGDLFHERYRQDLVREFAMIKQVTKENGAYAT
YLSGAGPTVMVLASHDKMPTIKAELEKQPFKGKLHDLRVDTQGVRVEAKZ 

ID20 564bp

ATGAAATATCACGATTACATCTGGGATTTAGGTGGAACTTTACTGGATAATTATGAAACTTCAACAGCTGCATTT 
TTGAAACATTGGCACTGTATGGTATCACACAAGACCATGACAGTGTCTATCAAGCTTTAAAGGTTTCTACTCCT^ 
TGCGATTG AGAC ATTCGCTCCCAATTTAGAGAATTTTTTAGAAAAGTACAAGGAAAATGAAGCCAGAGAGCTTGA 
ACACCCGATTTTATTTGAA GGAG TTTCT GACCT ATTGGAAGACAT 
TCTCATCGAAATGATCAGGTTTTGGAAATTTTAGAAAAAACCTCTATAGCAGCT^ 

CTAGCTCAGGCTTTAAGAGAAAGCCAAATCCCGAATCCATGCTTTATTTAAGAGAAAAGTATCAGATTAGCTCTG 
GTCTTGTCATTGGTGATCGGCCGATTGATATCGAAGCAGGTCAAGCrGCAGGACTTGATACCCACTTGTTTACCAG 
TATCGTGAATTTAAGACAAGTATTAGACATATAA 

MKYHDYIWDLGGTLLDNYETSTAAFVETLALYGITQDHDSVYQALKVSTPFAIETFAPNLENFLEKYKENEARELEHPI 
LFEGVSDLLEDISNQGGRHFLVSHRNDQVLEILEKTSIAAYFTEVVTSSSGFKRKPNPESMLYLREKYQISSGLVIGDRPID 
IEAGQAAGLDTHLFTSIVNLRQVLDIZ 

ID21 1875bp

ATGACAGAAGAAATCAAAAATCTGCAGGCACAGGATTATGATGCCAGTCAAATTCAAGTTTTAGAGGGCTTAGAG 
GCTGTTCGTATGCGTCCAGGGATGTACATTGGATCAACCrCAAAAGAAGGTCrTCACCATCTAGTCTGGGAAATTG 
TTGATAACTCAATTGACGAGGCCTTGGCAGGATTTGCCAGCCATATTCAAGTTTTTATTGAGCCAGATGATTCGAT 
TACTGTTGTGGATGATGGGCGTGGTATCCCAGTCGATATTCAGGAAAAAACAGGCCGTCCTGCTGTTGAGACCGT
CTTTACAGTCCTTCACGCTGGAGGAAAGTTCGGCGGTGGTGGATACAAGGTTTCAGGTGGTCTTCACGGGGTGGG
GTCGTCAGTAGTTAATGCCCTTTCCACTCAATTAGACGTTCATGTTCACAAAAATGGTAAGATTCATTACCAAGAA 
TACCGTCGTGGTCATGTTGTCGCAGATCTTGAAATAGTTGGAGATACGGATAAAACAGGAACAACTGTTCACTTC 
ACACCGGACCCAAAAATCTTCACTGAAACAACAATCTTTGATTTTGATAAATTAAATAAACGGAT^ 
GCCTTTCTAAATCGCGGTCTTCAAATTTCAATTACAGATAAGCGCCAAGGTTTGG AACAA ACCAAGCATT

ATGAAGGTGGGATTGCTAGTTACGTTGAATATATCAACGAGAACAAGGATGTAATCTTTGATACACCAATCTATA 
CAGACGGTGAGATGGATGATATCACAGTTGAGGTAGCCATGCAGTACACAACTGGTTACCATGAAAATGTCATGA 
GTTTCGCCAATAATATTCATACCCATGAAGGTGGAACACATGAACAAGGTTTCCGTACAGCCTTGACACGTGTTAT 
CAACGATTATGCTCGTAAAAATAAGTTACTGAAAGACAATGAAGATAATTTAACAGGGGAAGATGTTCGCGAAGG 
CTTAACTGCAGTTATCTCAGTTAAACACCCAAATCCACAGTTTGAAGGACAAACCAAGACCAAATTGGGAAATAG
CGAAGTGGTCAAGATTACCAATCGCCTCTTCAGTGAAGCTTTCTC 

AAACGTATCGTAGAAAAAGGAATTTTGGCTGCCAAGGCTCGTGTGGCTGCCAAGCGTGCGCGTGAAGTCACACGT 
AAAAAATCTGGTTTGGAAATTTCCAACCTTCCAGGGAAACTAGCAGACT^ 

AACTCTTCATCGTCGAAGGAGACTCAGCTGGTGGATCAGCCAAATCTGGTCGTAACCGTGAGTTTCAGGCTATCCT 
TCCAATTCGCGGTAAGATTTTGAACGTTGAAAAAGCAAGTATGGATAAGATTCTAGCCAACGAAGAAATTCGTAG
TCTTTTCACAGCCATGGGAACAGGATTTGGCGCAGAATTTGATGTTTCGAAAGCCCGTTACCAAAAACT 
ATGACCGATGCCGATGTCGATGGAGCCCACATTCGTACCCTTCTTTTAACCTTGATTTATCGTTATATGAAACCAA 
TCCTAGAAGCTGGTTATGTTTATATTGCCCAACCACCAATCTATGGTGTCAAGGTTGGAAGCGAGATTAAAGAATA 
TATCCAGCCGGGTGCAGATCAAGAAATCAAACTCCAAGAAGCTTTAGCCCGTTATAGTGAAGGTCGTACCAAACC 
GACTATTCAGCGTTATAAGGGGCTAGGTGAAATGGACGATCATCAGCTGTGGGAAACAACCATGGATCCCGAACA
TCGCTTGATGGCTAGAGTTTCTGTAGATGATGTGCAGAAGCAGATAAAATCTTTGATATGTTGA 

MTEEIKNLQAQDYDASQIQVLEGLEAVRMRPGMYIGSTSKEGLHHLVWEIVDNSIDEALAGFASHIQVFIEPDDSITVVD 
DGRGIPVDIQEKTGRPAVETVFTVLHAGGKFGGGGYKVSGGLHGVGSSVVNALSTQLDVHVHKNGKIHYQEYRRGHV 

VADLEIVGDTDKTGTTVHFTPDPKIFTETTIFDFDKLNKRIQELAFLNRGLQISITDKRQGLEQTKHYHYEGGIASYVEYI
NENKDVIFDTPIYTDGEMDDITVEVAMQYTTGYHENVMSFANNIHTHEGGTHEQGFRTALTRVINDYARKNKLLKDN 
EDNLTGEDVREGLTAVISVKHPNPQFEGQTKTKLGNSEVVKITNRLFSEAFSDFLMENPQIAKRIVEKGILAAKARVAAK 
RAREVTRKKSGLEISNLPGKLADCSSNNPAETELFIVEGDSAGGSAKSGRNREFQAILPIRGKILNVEKASMDKILANEEI 
RSLFTAMGTGFGAEFDVSKARYQKLVLMTDADVDGAHIRTLLLTLIYRYMKPILEAGYVYIAQPPIYGVKVGSEIKEYI 

QPGADQEIKLQEALARYSEGRTKPTIQRYKGLGEMDDHQLWETTMDPEHRLMARVSVDDVQKQIKSLICZ

ID54 1446bp

ATGAGTAGAGGTTTTAAAAAATCACGTTCACAGAAAGTGAAGCGAAGTGTTAATATAGTTTTGCTGACTATTTAT^ 
TATTGTTAGTTTGTTTTTTATTGTTCTTAATCTTTAAGTACAATATCCTTGCTTTTAGAT

CTGCGTTAGTCCTACTAGTTGCCTTGGTAGGGCTACTCTTGATTATCTATAAAAAAGCTGAAAAGTTTACT 
CTGTTGGTGTTCTCTATCCTTGTCAGCTCTGTGTCGCTCTTTGCAGTACAGCAGTTTGTTGGA 

AAATGCGACTTCTAATTACTCAGAATATTCAATCAGTGTCGCTGTTTTAGCAGATAGTGAGATCGAAAATGTTACG 
CAACTGACGAGTGTGACAGCACCGACTGGGACTAATAATGAAAATATTCAGAAATTACTAGCTGATATCAAGTCA 

AGTCAGAATACCGATTTGACGGTCAACCAGAGTTCGTCTTACTTGGCAGCTTACAAGAGTTTGATTGCAGGGGAG
ACTAAGGCCATTGTCCTAAATAGTGTCTTTGAAAACATCATCGAGTCAGAGTATCCAGACTACGCATCGAAGATA 
AAAAAGATTTATACTAAGGGATTCACTAAAAAAGTAGAAGCTCCTAAGACGTCTAAGAGTCAGTCTTTCAATATC 
TATGTTAGTGGAATTGACACCTATGGTCCTATTAGTTCGGTGTCGCGATCAGATGTCAACATCCTGATGACTGTCA 
ATCGAGATACCAAGAAAATCCTCTTGACCACAACGCCACGTGATGCCTATGTACCAATCGCAGATGGTGGAAATA 

ATCAAAAAGATAAATTGACTCATGCGGGCATTTATGGAGTTGATTCGTCCATTCACACCTTAGAAAATCTCTATGG
AGTGGATATCAATTACTATGTGCGATTGAACTTCACrrCGTTTTTGAAATTGATTGATTTGTTGGGTGGA 
TTTATAATGATCAAGAATTTACTGCCCATACGAATGGAAAGTATTACCCTGCAGGCAATGTTCATCTTGATTCAGA 
ACAGGCTCTCGGTTTTGTTCGTGAGCGCTACTCCCTAGCAGATGGCGATCGTGACCGCGGGCGCCATCAACAAAA 
GGTGATTGTGGCTATCCTTCAAAAATTAACGTCAACCGAAGTGCTGAAAAATTATAGTACGATCATTAATAGCTTG 

CAAGATTCTATCCAAACAAATATGCCACTTGAGACCATGATAAATTTGGTCAATGCTCAGTTAGAAAGTGGAGGG
AATTATAAAGTAAATTCTCAAGATTTAAAAGGGACAGGTCGGATGGATCTTCCTTCTTATGCAATGCCAGACAGTA 
ACCTCTATGTGATGGAAATAGATGATAGTAGTTTAGCTGTAGTTAAAGCAGCTATACAGGATGTGATGGAGGGTA 
GATGA 

MSRRFKKSRSQKVKRSVNIVLLTIYLLLVCFLLFLIFKYNILAFRYLNLVVTALVLLVALVGLLLIIYKKAEKFTIFLLVFS
ILVSSVSLFAVQQFVGLTNRLNATSNYSEYSISVAVLADSEIENVTQLTSVTAPTGTNNENIQKLLADIKSSQNTDLTVNQ
SSSYLAAYKSLIAGETKAIVLNSVFENIIESEYPDYASKIKKIYTKGFTKKVEAPKTSKSQSFNIYVSGIDTYGPISSVSRSDV 
NILMTVNRDTKKILLTTTPRDAYVPIADGGNNQKDKLTHAGIYGVDSSIHTLENLYGVDINYYVRLNFTSFLKLIDLLGG 
IDVYNDQEFTAHTNGKYYPAGNVHLDSEQALGFVRERYSLADGDRDRGRHQQKVIVAILQKLTSTEVLKNYSTIINSLQ 

DSIQTNMPLETMINLVNAQLESGGNYKVNSQDLKGTGRMDLPSYAMPDSNLYVMEIDDSSLAVVKAAIQDVMEGRZ

ATGATAGACATCCATTCGCATATCGTTTTTGATGTAGATGACGGTCCCAAGTCAAGAGAGGAAAGCAAGGCTCTC
TTGGCAGAATCCTACAGACAGG GGGTG CGAACCATTGTTTCTACCTCTCACCGTCGCAAGGGCATGTTTGAAACTC 
CGGAAGAGAAGATAGCAGAAAACITTCTTCAGGTTCGGGAAATAGCTAAGGAAGTGGCGAGTGACTTGGTCATTG 
CTTACGGGGCTGAAATTTATTACACACCAGATGTTCTGGATAAGCTGGAAAAAAAGCGGATTCCGACCCTCAATG 
ATAGTCGTTATGCCTTGATAGAGTTTAGTATGAACACTCCTTATCGCGATATTCATAGCGCCTTGAGCAAGATCTT
GATGTTGGGAATTACTCCAGTCATTGCCCACATTGAGCGCTATGATGCTCTTGAAAATAATGAAAAACGCGTTCGA 
GAACTGATCGATATGGGCTGTTACACGCAAGTAAATAGTTCACATGTCCTCAAACCCAAACTTTTTGGCGAACGTT 
ATAAATTCATGAAAAAAAGAGCTCAGTATTTTTTAGAGCAGGATTTGGTTCATGTCATTGCAAGTGATATGCACAA 
TCTAGACGGTAGACCTCCTCATATGGCAGAAGCATATGACCITGTTACCCAAAAATACGGAGAAGCGAAGGCTCA 
GGAACTTTTTATAGACAATCCTCGAAAAATTGTAATGGATCAACTAATTTAG

MIDIHSHIVFDVDDGPKSREESKALLAESYRQGVRTIVSTSHRRKGMFETPEEKIAENFLQVREIAKEVASDLVIAYGAEI
YYTPDVLDKLEKKRIPTLNDSRYALIEFSMNTPYRDIHSALSKILMLGITPVIAHIERYDALENNEKRVRELIDMGCYTQV
NSSHVLKPKLFGERYKFMKKRAQYFLEQDLVHVIASDMHNLDGRPPHMAEAYDLVTQKYGEAKAQELFIDNPRKIVM
DQLIZ

ID58 3990bp 

TTGA TTTATATAATCGCTATCAATATAACAATGCAATCAGGAGGTTTTGCAATGAAACATGAAAAACAACAGCGT 

TTTTCTATTCGTAAATACGCTGTAGGAGCAGCTTCTGTTCTAAT^

CCG ATjGG AGTTACTCCTACT ACT ACAG A A A A CCA. AXCG AXC A_TCC ATACGGTTTCTG A.TTCCCCTG AjKTGATGGG A 
AAATCGGACTGAGGAAACACCTAAAGCAGTGCTTCAACCAGAAGCTCCAAAAACTGTAGAAACAGAAACTCCAG 
CrACTGATAAGGTAGCTAGTCITCCAAAAACAGAAGAAAAACCACAAGAGGAAGTTAGTTCAACTCCTAGTGATA 
AAGCAGAAGTGGTAACTCCAACTTCTGCTGAAAAAGAAACTGCTAATAAAAAGGCAGAAGAAGCTAGCCCTA^ 

AAGGAAGAAGCGAAAGAGGTTGATTCTAAAGAGTCAAATACAGACAAGACTGACAAGGATAAACCAGCTAAAAA
AGATGAAGCGAAAGCAGAGGCTGACAAACCGGCAACAGAGGCAGGAAAGGAACGTGCTGCAACTGTAAATGAAA 
AACrAGCGAAAAAGAAAATTGTTTCrATTGATGCTGGACGTAAATATTTCrCACCAGAACAGCTCAAGG 
TCGATAAAGCGAAACATTATGGCTACACrGATTTACACCTATTAGTCGGAAATGATGGACTCCGTTTCATGTTGGA 
CGATATGAGCATCACAGCTAACGGCAAGACCTATGCCAGTGACGATGTCAAACGCGCCATTGAAAAAGGTACAAA 

TGATTATTACAACGATCCAAACGGCAATCACTTAACAGAAAGTCAAATGACAGATCTGATTAACTATGCCAAAGA
TAAAGGTATCGGTCTCAT TCCGA CAGT AAAT AGTCCTGGACACATGGATGCGATTCTCAATGCCATGAAAGAATT 
GGGAATCCAAAACCCTAACITTAGCrATTTTGGGAAGAA 
TGTCGCTTTTACAAAAGCCCITATCGACAAGTATGCTGCT 

CITGATGAATATGCCAATGATGCGACAGATGCTAAAGGTTGGAGTGTGCTTCAAGCTGATAAATACTATCCAAAC 
GAAGGCTACCCrGTAA AAGGCT ATGAAAAATTTATTGCCTACGCCAATGACCTCGCTCGTATTGTAAAATCGC
GGTCTCAAACCAATGGCTTTTAACGACGGTATCTACTACAATAGCGACACAAGCTTTGGTAGTTT^ 
ATCATCGTTTCTATGTGGACTGGTGGTTGGGGAGGCTACGATGTCGCTT 

ACCAAATCCTTAATACCAATGATGCITGGTACTACGTTCTTGGACGAAACGCTGATGGCCAAGGCTGGTACAATCT 
CGATCAGGGGCTCAATGGTATTAAAAACACACCAATCACTTCTGTACCAAAAACAGAAGGAGCTGATATCCCAAT 

CATCG GTGG TATGGTAGCTGCTTGGGCTGACACTCCATCrGCACGTTATTCACCATCACGCCTCTTCAAACT

CGTCATTTTGCAAATGCCAACGCTGAATACTTCGCAGCTGATTATGAATCTGCAGAGCAAGCACTTAACGAGGTA 
CCAAAAGACCrGAACCGTTATACTGCAGAAAGCGTCACGGCCGTAAAAGAAGCTGAAAAAGCTATTCGCTCTCTC 
GATAGCAACCITAGCCGTGCCCAACAAGATACGATTGATCAAGCCATTGCTAAACTTCAAGAAACrGTCAACAAC 
TTGACCCTCACGCCTGAAGCTCAAAAAGAAGAAGAAGCTAAACGTGAGGTTGAAAAACTTGCCAAAAACAAGGT 

AATCTCAATCGATGCTGGACGCAAATACnTTACTCTGAACCAGCTCAAACGCATCGTAGACAAGGCCAGTGAGCT
CGGATATTCTGATGTCCATCTCCrTCrAGGAAATGACGGACTTCGCrTTCTACTCGATGATATGACCATTACTGCC 
AACGGAAAAACCTATGCTAGTGATGACGTTAAAAAAGCTATTATCGAAGGAACTAAAGCTTACTACGACGATCCA 
AACGGTACTGCACTAACACAGGCAGAAGTAACAGAGCTAATTGAATACGCTAAATCTAAGGACATCGGTCTCATC 
CCAG CTATT AACAGTCCAGGTCACATGGATGCTATGCTGGTTGCCATGGAAAAATTAGGTATTAAAAATCCTCAA 

GCCCACTTTGATAAAGTTTC AAAA ACAACTATGGACTTGAAAAACGAAGAAGCGATGAACyTTGTAAAAGCCCT
ATCGGTAAATACATGGACTTCITTGCAGGTAAAACAAAGATTTTCAACTTTGGTACTGACGAATACGCCAACGAT 
GCGACTAGTGCCCAAGGCTGGTACTACCTCAAGTGGTATCAACTCTATGGCAAATTTGCCGAATATGCCAACACC 
CTCGCAGCTATGGCCAAAGAAAGAGGGCTTCAACCAATGGCCITCAACGATGGCnTCTACTATGAAGACAAGGAC 
GATGTTCAGTTTGACAAAGATGTCTTGATTTCTTACTGGTCT 

AATACCTAGCAAGCAAAGGCTATAAATTCITGAATACCAACGGTGACTGGTACTACATTCTTGGTCAAAAACCAG
AAGATGGTGGTGGTTTCCTCAAGAAAGCTATTGAGAATACrGGAAAAACACCATTCAATCAACrAGCTTCTACCA 
AATATCCTGAAGT AGATC nTCCAACAGTCGGAAGTATGCTTTCAATCTGGGCAGATAGACCAAGCGCTGAATACA 
AGGAAGAGGAAATCnTGAACTCATGACTGCCTTTGCAGACCACAACAAAGACrACITTCGTGCT 
CrCrCCGCGAAGAATTAGCrAAAATTCCTACAAACrTAGAAGGATATAGTAAAGAAAGTCTTGAGGCCCTTGACG 

CAGCTAAAACAGCTCTAAATTACAACCTCAACCGTAj\TAAACAAGCrGAGCTTGACACGCrTGTAGCCAA

AAGCCGCrCTTCAAGGCCrCAAACCAGCTGTAACTCATTCAGGAAGCCTAGATGAAAATGAAGTGGCTGCCAATG 
TTGAAACCAGACCAGAACTCATCACAAGAACTGAAGAAATTCCATTTGAAGTTATCAAGAAAGAAAATCCTAACC 
TCCCAGCCGGTCAGGAAAATATTATCACAGCAGGAGTCAAAGGTGAACGAACTCATTACATCTCTGTACTCACTG 
AAAATGGAAAAACAACAGAAACAGTCCTTGATAGCCAGGTAACCAAAGAAGTTATAAACCAAGTGGTTGAAGTT 

GGCGCTCCTGTAACTCACAAGGGTGATGAAAGTGGTCITGCACCAACTACTGAGGTAAAACCTAGACTGGATATC
CAAGAAGAAGAAATTCCATTTACCACAGTGACTTGTGAAAATCCACTCTTACTCAAAGGAAAAACACAAGTCATT
ACTAAGGGCGTCAATGGACATCGTAGCAACTTCTACTCTGTGAGCACTTCTGCCGATGGTAAGGAAGTGAAAACA 
CTTGTAAATAGTGTCGTAGCACAGGAAGCCGTTACTCAAATAGTCGAAGTCGGAACTATGGTAACACATGTAGGC 
GATGAAAACGGACAAGCCGCTATTGCTGAAGAAAAACCAAAACTAGAAATCCCAAGCCAACCAGCTCCATCAAC 
TGCTCCTGCTGAGGAAAGCAAAGTTCTTCCrCAAGATCCAGCTCCTGTGGTAACAGAGAAAAAACTTCCT

AGGAACTCACGATTCTGCAGGACTAGTAGTCGCAGGACTCATGTCCACACTAGCAGCCTATGGACTCACTAAAAG 
AAAAGAAGACTAA 

MIYIIAINITMQSGGFAMKHEKQQRFSIRKYAVGAASVLIGFAFQAQTVAADGVTPTTTENQPTIHTVSDSPQSSENRTEE

TPKAVLQPEAPKTVETETPATDKVASLPKTEEKPQEEVSSTPSDKAEVVTPTSAEKETANKKAEEASPKKEEAKEVDSKE
SNTDKTDKDKPAKKDEAKAEADKPATEAGKERAATVNEKLAKKKIVSIDAGRKYFSPEQLKEIIDKAKHYGYTDLHLL
VGNDGLRFMLDDMSITANGKTYASDDVKRAIEKGTNDYYNDPNGNHLTESQMTDLINYAKDKGIGLIPTVNSPGHMD 
AILNAMKELGIQNPNFSYFGKKSARTVDLDNEQAVAFTKALIDKYAAYFAKKTEIFNIGLDEYANDATDAKGWSVLQA
DKYYPNEGYPVKGYEKFIAYANDLARIVKSHGLKPMAFNDGIYYNSDTSFGSFDKDIIVSMWTGGWGGYDVASSKLLA 

EKGHQILNTNDAWYYVLGRNADGQGWYNLDQGLNGIKNTPITSVPKTEGADIPIIGGMVAAWADTPSARYSPSRLFKL
MRHFANANAEYFAADYESAEQALNEVPKDLNRYTAESVTAVKEAEKAIRSLDSNLSRAQQDTIDQAIAKLQETVNNLT 
LTPEAQKEEEAKREVEKLAKNKVISIDAGRKYFTLNQLKRIVDKASELGYSDVHLLLGNDGLRFLLDDMTITANGKTYA
SDDVKKAIIEGTKAYYDDPNGTALTQAEVTELIEYAKSKDIGLIPAINSPGHMDAMLVAMEKLGIKNPQAHFDKVSKTT
MDLKNEEAMNFVKALIGKYMDFFAGKTKIFNFGTDEYANDATSAQGWYYLKWYQLYGKFAEYANTLAAMAKERGL

QPMAFNDGFYYEDKDDVQFDKDVLISYWSKGWWGYNI^\SPQYLASKGYKFLNTNGDWYYILGQKPEDGGGFLKKAI 
ENTGKTPFNQLASTKYPEVDLPTVGSMI^IWADRPSAEYKEEEIFELMTAFADHNKDYFRANYNALREELAKIPTNLEG 
YSKESLEALDAAKTALNYNLNRNKQAELDTLVANLKAALQGLKPAVTHSGSLDENEVAANVETRPELITRTEEIPFEVI 
KKENPNLPAGQENHTAGVKGERTHYISVLTENGKTTETVLDSOVTKEVINQVVEVGAPVTHKGDESGLAPTTEVKPRL 
DIQEEEIPFTTVTCENPLLLKGKTQVrnCGVNGHRSNFYSVSTSADGKEVIO'LVNSVVAQEAVTQIVEVGTMVTHVGDE 

NGQAAIAEEKPKLEIPSQPAPSTAPAEESKVLPQDPAPVVTEKKLPETGTHDSAGLWAGLMSTLAAYGLTKRKEDZ 

ID122 825bp 

ATGAACAAAAAAACAAGACAGACACTAATCGGACTGCTAGTGTTATTG 

AAGCAGATGCCGTCGGCACCTAATAGTCCCAAAACCAATCTTAGTCAGAAAAAACAAGCGTCTGAAGCTCCTAGT 
CAAGCATTGGCAGAGAGTGTCTTAACAGACGCAGTCAAGAGTCAAATAAAGGGGAGTCTGGAGTGGAATGGCTC 
AGGTGCTTTTATCGTCAATGGTAATAAAACAAATCTAGATGCCAAGGTTTCAAGTAAGCCCTACGCTGACAATAA 
AACAAAGACAGTGGGCAAGGAAACTGTTCCAACCGTAGCTAATGCCCrCTTGTCTAAGGCCACTCGTCAGTACAA 
GAATCGTAAAGAAACTGGGAATGGTTCAACTTCTTGGACTCCTCCAGGTTGGCATCAGGTCAAGAATCTAA 

CTCTTATACCCATGCAGTCGATAGAGGTCATTTGTTAGGCrATGCCITAATCGGTGGTTTGGATG U riMM'GATGCCT 
CAACAAGCAATCCTAAAAACATTGCTGTTCAGACAGCCTGGGCAAATCAGGCACAAGCCGAGTATTCGACTGGTC 
AAAACTACTATGAAAGCAAGGTGCGTAAAGCCTTGGACCAAAACAAGCGTGTCCGTTACCGTGTAACCCTTTACT 
ACGCTTCAAACGAGGATTTAGTTCCCTCAGCTTCACAGATtGAAGCCAAGTCTTCGGATGGAGAATTGGAA 
ATGTTCTAGTTCCCAATGTTCAAAAGGGACTTCAACTGGATTACCGAACTGGAGAAGTAACTGTAACTCAGTAA 


MNKKTRQTLIGLLVLLLLSTGSYYIKQMPSAPNSPKTNLSQKKQASEAPSQALAESVLTDAVKSQIKGSLEWNGSGAFIV 
NGNKTNLDAKVSSKPYADNKTKTVGKETVPTVANALLSKATRQYKNRKETGNGSTSWTPPGWHQVKNLKGSYTHAV 
DRGHLLGYALIGGLDGFDASTSNPKNIAVQTAWANQAQAEYSTGQNYYESKVRKALDQNKRVRYRVTLYYASNEDL 
VPSASQIEAKSSDGELEFNVLVPNVQKGLQLDYRTGEVTVTQZ 

ID123 225bp 

GTGCTAAGATTCAGCGGATTGAGGCAAGTGATGAAGATGAATAAGAAATCAAGCTACGTAGTCAAGCGTTTACTT 
TTAGTCATCATAGTACTGATTTTAGGTACTCTGGCTCTAGGAATCGGTTTAATGGTAGGTTATGGAATCTTGGGCA 
AGGGTCAAGATCCATGGGCTATCCTGTCTCCAGCAAAATGGCAGGAATTGATTCATAAATTTACAGGAAATTAG 

VLRFSGLRQVMKMNKKSSYVVKRLLLVIIVLILGTLALGIGLMVGYGILGKGQDPWAILSPAKWQELIHKFTGNZ 
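
The 225bp nucleotide sequence above and the amino acid sequence that follows it can be checked against each other by translating the DNA codon by codon. The following is a minimal sketch only, assuming the standard genetic code and assuming that "Z" in the listing marks the stop codon; the names `translate`, `ID123_DNA` and `ID123_PROTEIN` are illustrative and do not appear in the application. Page-margin line numbers have been left out of the sequence strings.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Sequence as printed in the listing (assumed free of OCR damage).
ID123_DNA = (
    "GTGCTAAGATTCAGCGGATTGAGGCAAGTGATGAAGATGAATAAGAAATCAAGCTACGTAGTCAAGCGTTTACTT"
    "TTAGTCATCATAGTACTGATTTTAGGTACTCTGGCTCTAGGAATCGGTTTAATGGTAGGTTATGGAATCTTGGGCA"
    "AGGGTCAAGATCCATGGGCTATCCTGTCTCCAGCAAAATGGCAGGAATTGATTCATAAATTTACAGGAAATTAG"
)
ID123_PROTEIN = (
    "VLRFSGLRQVMKMNKKSSYVVKRLLLVIIVLILGTLALGIGLMVGYGILG"
    "KGQDPWAILSPAKWQELIHKFTGNZ"
)


def translate(dna, stop_symbol="Z"):
    """Translate a DNA string in reading frame 1, stopping at the first stop codon.

    The stop codon is reported as `stop_symbol` (assumed here to be "Z",
    following the convention the listing appears to use).
    """
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for i in range(0, len(dna), 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            protein.append(stop_symbol)
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(len(ID123_DNA))                           # length in base pairs
    print(translate(ID123_DNA) == ID123_PROTEIN)    # do the two listings agree?
```

The same check is less informative for entries whose printed sequence contains scanning errors, since a single damaged base changes or breaks the affected codon.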



CLAIMS: 

1. A Streptococcus pneumoniae protein or polypeptide having a sequence 
selected from those shown in Table 1. 

2. A Streptococcus pneumoniae protein or polypeptide having a sequence 
selected from those shown in Table 2. 

3. A protein or polypeptide as claimed in claim 1 or claim 2 provided in 
substantially pure form. 

4. A protein or polypeptide which is substantially identical to one defined in any 
one of claims 1 to 3. 

5. A homologue or derivative of a protein or polypeptide as defined in any one of 
claims 1 to 4. 

6. An antigenic and/or immunogenic fragment of a protein or polypeptide as 
defined in Tables 1-3. 

7. A nucleic acid molecule comprising or consisting of a sequence which is: 

(i) any of the DNA sequences set out in Table 1 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which is substantially identical with any of those of (i), (ii) 
and (iii); 

(v) a sequence which codes for a homologue, derivative or fragment of a 
protein as defined in Table 1. 

8. A nucleic acid molecule comprising or consisting of a sequence which is: 



(i) any of the DNA sequences set out in Table 2 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which is substantially identical with any of those of (i), (ii) 
and (iii); 

(v) a sequence which codes for a homologue, derivative or fragment of a 
protein as defined in Table 2. 



9. The use of a protein or polypeptide having a sequence selected from those 
shown in Tables 1-3, or homologues, derivatives and/or fragments thereof, as an 
immunogen and/or antigen. 

10. An immunogenic and/or antigenic composition comprising one or more 
proteins or polypeptides selected from those whose sequences are shown in Tables 
1-3, or homologues or derivatives thereof, and/or fragments of any of these. 

11. An immunogenic and/or antigenic composition as claimed in claim 10 which 
is a vaccine or is for use in a diagnostic assay. 

12. A vaccine as claimed in claim 11 which comprises one or more additional 
components selected from excipients, diluents, adjuvants or the like. 

13. A vaccine composition comprising one or more nucleic acid sequences as 
defined in Tables 1-3. 

14. A method for the detection/diagnosis of S. pneumoniae which comprises the 
step of bringing into contact a sample to be tested with at least one protein or 
polypeptide as defined in Tables 1-3, or homologue, derivative or fragment thereof. 

15. An antibody capable of binding to a protein or polypeptide as defined in 
Tables 1-3, or to a homologue, derivative or fragment thereof. 

16. An antibody as defined in claim 15 which is a monoclonal antibody. 

17. A method for the detection/diagnosis of S. pneumoniae which comprises the step 
of bringing into contact a sample to be tested and at least one antibody as defined in 
claim 15 or claim 16. 

18. A method for the detection/diagnosis of S. pneumoniae which comprises the 
step of bringing into contact a sample to be tested with at least one nucleic acid 
sequence as defined in claim 7 or claim 8. 

19. A method of determining whether a protein or polypeptide as defined in 
Tables 1-3 represents a potential anti-microbial target which comprises inactivating 
said protein or polypeptide and determining whether S. pneumoniae is still viable. 

20. The use of an agent capable of antagonising, inhibiting or otherwise 
interfering with the function or expression of a protein or polypeptide as defined in 
Tables 1-3 in the manufacture of a medicament for use in the treatment or 
prophylaxis of S. pneumoniae infection. 